How to standardize columns of a data.table object by group in R?

Updated on 2025/6/24 6:07:17

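To standardize a column of a data.table object by group, we can apply the scale function to the column and pass the grouping column through the by argument.

For example, suppose we have a data.table object named DT with two columns, G and Num, where G is the grouping column and Num is a numeric column. We can then standardize Num within each level of G using the following command: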

DT[,"Num":=as.vector(scale(Num)),by=G]
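The command above can be tried on a small table where the result is easy to verify by hand. The sketch below uses hypothetical toy values for G and Num (they are not from the article). It also shows that scale returns a one-column matrix, which is presumably why the command wraps it in as.vector before assigning the result back.

```r
library(data.table)

# Hypothetical toy data: G is the grouping column, Num the numeric column
DT <- data.table(G = c("a", "a", "a", "b", "b", "b"),
                 Num = c(1, 2, 3, 10, 20, 30))

# scale() returns a one-column matrix carrying centering/scaling attributes
is.matrix(scale(DT$Num))

# as.vector() turns the result back into a plain numeric vector;
# by = G makes the centering and scaling happen separately per group
DT[, Num := as.vector(scale(Num)), by = G]
DT
```

Group a has mean 2 and standard deviation 1, and group b has mean 20 and standard deviation 10, so both groups become -1, 0, 1 after standardization.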

Example 1

Consider the following data.table object:

library(data.table)
Grp<-sample(c("Male","Female"),20,replace=TRUE)
Response<-round(rnorm(20,5,1.25),2)
DT1<-data.table(Grp,Response)
DT1

This creates the following data.table:

       Grp Response
 1: Female     5.31
 2:   Male     5.20
 3: Female     6.38
 4:   Male     4.53
 5: Female     4.90
 6: Female     4.78
 7:   Male     3.73
 8: Female     6.19
 9:   Male     4.33
10:   Male     7.84
11:   Male     6.70
12: Female     5.11
13:   Male     6.80
14:   Male     3.76
15:   Male     3.56
16:   Male     5.51
17: Female     6.58
18: Female     7.59
19:   Male     4.62
20: Female     6.75

To standardize the Response column within each Grp group of the DT1 object created above, add the following code to the above snippet:

# DT1 was created in the snippet above; calling sample() and rnorm() again would draw new data
DT1[,"Response":=as.vector(scale(Response)),by=Grp]
DT1

Output

If you execute all the above snippets as a single program, it generates the following output:

       Grp    Response
 1: Female -0.66313371
 2:   Male  0.03955265
 3: Female  0.43789692
 4:   Male -0.43061348
 5: Female -1.08502396
 6: Female -1.20850403
 7:   Male -0.99200587
 8: Female  0.24238681
 9:   Male -0.57096158
10:   Male  1.89214752
11:   Male  1.09216337
12: Female -0.86893383
13:   Male  1.16233742
14:   Male -0.97095365
15:   Male -1.11130175
16:   Male  0.25709220
17: Female  0.64369704
18: Female  1.68298763
19:   Male -0.36745684
20: Female  0.81862714
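One way to confirm that a by-group standardization worked is to summarize the standardized column per group: each group mean should be zero up to floating-point error, and each group standard deviation should be 1. The sketch below is self-contained and builds its own table. The set.seed call and the names DT, grp_mean and grp_sd are assumptions added for illustration, so the values differ from the output shown above.

```r
library(data.table)

set.seed(42)  # assumption: a seed added only so this sketch is repeatable
DT <- data.table(Grp = rep(c("Male", "Female"), each = 10),
                 Response = round(rnorm(20, 5, 1.25), 2))

# Standardize Response separately within each Grp
DT[, Response := as.vector(scale(Response)), by = Grp]

# Per-group check: means should be ~0, standard deviations should be 1
check <- DT[, .(grp_mean = mean(Response), grp_sd = sd(Response)), by = Grp]
check
```

Because scale divides by the sample standard deviation (the n - 1 denominator that sd also uses), the grp_sd column should come out as 1 for every group.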

Example 2

The following snippet creates a sample data.table:

Class<-sample(c("I","II","III"),20,replace=TRUE)
Rate<-round(rnorm(20,10,1.02),0)
DT2<-data.table(Class,Rate)
DT2

This creates the following data.table:

    Class Rate
 1:    II   10
 2:   III    9
 3:    II   10
 4:    II   10
 5:   III   10
 6:   III    9
 7:   III    8
 8:    II   10
 9:    II   11
10:   III    9
11:     I    9
12:    II   11
13:   III   13
14:    II   10
15:   III   12
16:     I    8
17:    II    9
18:     I   10
19:   III    9
20:    II   10

To standardize the Rate column within each Class group of the DT2 object created above, add the following code to the above snippet:

# DT2 was created in the snippet above; calling sample() and rnorm() again would draw new data
DT2[,"Rate":=as.vector(scale(Rate)),by=Class]
DT2

Output

If you execute all the above snippets as a single program, it generates the following output:

    Class        Rate
 1:    II -0.18490007
 2:   III -0.50669175
 3:    II -0.18490007
 4:    II -0.18490007
 5:   III  0.07238454
 6:   III -0.50669175
 7:   III -1.08576803
 8:    II -0.18490007
 9:    II  1.47920052
10:   III -0.50669175
11:     I  0.00000000
12:    II  1.47920052
13:   III  1.80961338
14:    II -0.18490007
15:   III  1.23053710
16:     I -1.00000000
17:    II -1.84900065
18:     I  1.00000000
19:   III -0.50669175
20:    II -0.18490007
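Both examples standardize a single column. When several numeric columns need the same treatment, one possible extension is to combine lapply with data.table's .SD and .SDcols. The sketch below is an illustration rather than part of the original examples: the table DT, the columns X and Y, and the vector cols are hypothetical names.

```r
library(data.table)

# Hypothetical data: grouping column G and two numeric columns X and Y
DT <- data.table(G = c("a", "a", "a", "b", "b", "b"),
                 X = c(1, 2, 3, 4, 6, 8),
                 Y = c(10, 20, 30, 5, 5, 8))

cols <- c("X", "Y")

# (cols) on the left-hand side assigns to the columns named in `cols`;
# .SDcols restricts .SD to those columns within each group
DT[, (cols) := lapply(.SD, function(x) as.vector(scale(x))),
   by = G, .SDcols = cols]
DT
```

The parentheses around cols matter: without them, data.table would treat cols as the literal name of a new column instead of looking up the names stored in the variable.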
