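I have a large data frame with columns of several classes (character, factor, and numeric). A subset of the data frame can be reproduced with the following code: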
df <- structure(
list(
id = c("1", "2", "3", "4", "5"),
gender = structure(c(1L, 2L, 1L, 2L, 1L), levels = c("Female", "Male"), class = "factor"),
age = c(78, 64, 79, 98, 82),
score1 = c(-0.019375, -0.025835, -0.029842, -0.029842, -0.027398),
score2 = c(0.0004892, -0.001254932, -0.00135780, -0.00312374, -0.00685426),
score3 = c(-0.05938750, -0.1237563, -0.08442363, -0.09326243, -0.091492836)),
row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
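I want to use the scale function (base) to standardize score1, score2, and score3, add the results to the data frame under new names (the score name prefixed with a z), and keep the original scores in the data frame.
So far I have created the standardized scores with the code below, but I would like to use a function or a loop to avoid repeating the code, because my data frame has several more scores that need to be standardized.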
df$zscore1 <- scale(df$score1, center = TRUE, scale = TRUE)
df$zscore2 <- scale(df$score2, center = TRUE, scale = TRUE)
df$zscore3 <- scale(df$score3, center = TRUE, scale = TRUE)
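Any suggestions on how to do this?
Edit:
Sotos's solution works well for the example I provided. However, the names of the score columns are not as regular as in that example. My apologies for that. They look more like this: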
df <- structure(
list(
id = c("1", "2", "3", "4", "5"),
gender = structure(c(1L, 2L, 1L, 2L, 1L), levels = c("Female", "Male"), class = "factor"),
age = c(78, 64, 79, 98, 82),
AD = c(-0.019375, -0.025835, -0.029842, -0.029842, -0.027398),
PD1 = c(0.0004892, -0.001254932, -0.00135780, -0.00312374, -0.00685426),
DEM = c(-0.05938750, -0.1237563, -0.08442363, -0.09326243, -0.091492836)),
row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
The output I am looking for is as follows:
df$zAD_2 <- scale(df$AD, center = TRUE, scale = TRUE)
df$zPD1_2 <- scale(df$PD1, center = TRUE, scale = TRUE)
df$zDEM_2 <- scale(df$DEM, center = TRUE, scale = TRUE)
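One possible way to get this output without repeating a line per column is sketched below. It assumes the score columns can be listed by name in a character vector; score_cols and z_names are illustrative names that do not appear in the code above, and a plain data.frame stands in for the tibble so the sketch runs with base R only. Note that scale returns a one-column matrix, so the sketch wraps it in as.numeric to store ordinary numeric columns.

```r
# Small stand-in for the real data (same values as the example above).
df <- data.frame(
  id  = c("1", "2", "3", "4", "5"),
  age = c(78, 64, 79, 98, 82),
  AD  = c(-0.019375, -0.025835, -0.029842, -0.029842, -0.027398),
  PD1 = c(0.0004892, -0.001254932, -0.00135780, -0.00312374, -0.00685426),
  DEM = c(-0.05938750, -0.1237563, -0.08442363, -0.09326243, -0.091492836)
)

# Columns to standardize (assumption: they are known and listed by hand).
score_cols <- c("AD", "PD1", "DEM")

# New names built from the pattern in the desired output: "z" prefix, "_2" suffix.
z_names <- paste0("z", score_cols, "_2")

# Standardize each listed column and add it under its new name;
# the original columns are left untouched.
df[z_names] <- lapply(
  df[score_cols],
  function(x) as.numeric(scale(x, center = TRUE, scale = TRUE))
)
```

If the score columns are easier to describe by exclusion, score_cols could instead be built with something like setdiff(names(df), c("id", "gender", "age")).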