How to find the total of an integer column based on two different character columns in R?
Updated on 2025/4/12 4:22:17
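Finding the total of an integer column based on two different character columns amounts to building a two-way table, laid out like a contingency table, whose cells hold sums rather than counts. We can do this with the with and tapply functions. For example, if we have a data frame df with two categorical columns, gender and ethnicity, and an integer column, Package, then the table can be created as follows: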
with(df,tapply(Package,list(gender,ethnicity),sum))
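To make the general form concrete, here is a minimal sketch that applies it to a tiny data frame. The values below are invented purely for illustration and are not taken from this article; only the column names gender, ethnicity and Package follow the text above.

```r
# Hypothetical data: the values are made up for illustration only
df <- data.frame(
  gender    = c("F", "F", "M", "M", "F"),
  ethnicity = c("A", "B", "A", "A", "A"),
  Package   = c(10L, 20L, 30L, 40L, 50L)
)

# Sum Package within every gender/ethnicity combination.
# Rows of the result are gender levels, columns are ethnicity levels.
totals <- with(df, tapply(Package, list(gender, ethnicity), sum))
print(totals)
```

Combinations that never occur in the data (here M with B) come back as NA, because tapply has no values to pass to sum for that cell.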
Example
Consider the data frame below:
set.seed(777)
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Group<-sample(c("GP1","GP2","GP3","GP4"),20,replace=TRUE)
Rate<-sample(0:10,20,replace=TRUE)
df1<-data.frame(Class,Group,Rate)
df1
Output
    Class Group Rate
1   First   GP1    7
2  Second   GP2    1
3  Second   GP4    1
4  Second   GP4    0
5   Third   GP2   10
6  Second   GP2    8
7   First   GP1    7
8   First   GP4    4
9  Second   GP1    4
10  Third   GP3    8
11 Second   GP2    8
12  First   GP2    4
13  Third   GP2    6
14  Third   GP4    4
15  Third   GP4    5
16 Second   GP1    2
17 Second   GP1    9
18 Second   GP3    2
19 Second   GP3    1
20  Third   GP4   10
Example
str(df1)
'data.frame': 20 obs. of 3 variables:
 $ Class: chr "First" "Second" "Second" "Second" ...
 $ Group: chr "GP1" "GP2" "GP4" "GP4" ...
 $ Rate : int 7 1 1 0 10 8 7 4 4 8 ...
Finding the total of Rate based on Class and Group:
with(df1,tapply(Rate,list(Class,Group),sum))
       GP1 GP2 GP3 GP4
First   14   4  NA   4
Second  15  17   3   1
Third   NA  16   8  19
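The NA entries in the table above mark Class/Group combinations that have no rows in df1 (First with GP3, Third with GP1). If zeros are preferred for those cells, one possible approach is sketched below. It rebuilds df1 from the printed values so that it runs on its own, and the names totals0 and totals are chosen here only for illustration.

```r
# Rebuild df1 from the values printed in the output above
Class <- c("First", "Second", "Second", "Second", "Third", "Second", "First",
           "First", "Second", "Third", "Second", "First", "Third", "Third",
           "Third", "Second", "Second", "Second", "Second", "Third")
Group <- c("GP1", "GP2", "GP4", "GP4", "GP2", "GP2", "GP1", "GP4", "GP1", "GP3",
           "GP2", "GP2", "GP2", "GP4", "GP4", "GP1", "GP1", "GP3", "GP3", "GP4")
Rate  <- c(7L, 1L, 1L, 0L, 10L, 8L, 7L, 4L, 4L, 8L,
           8L, 4L, 6L, 4L, 5L, 2L, 9L, 2L, 1L, 10L)
df1 <- data.frame(Class, Group, Rate)

# Option 1: xtabs() with a left-hand side sums that column per cell
# and reports 0 for combinations that have no rows
totals0 <- xtabs(Rate ~ Class + Group, data = df1)
print(totals0)

# Option 2: keep tapply() and overwrite the NA cells afterwards
totals <- with(df1, tapply(Rate, list(Class, Group), sum))
totals[is.na(totals)] <- 0
print(totals)
```

Note that a 0 produced this way no longer distinguishes "no rows" from "rows that sum to zero", so the NA version may be the better choice when that difference matters.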
Let's have a look at another example:
Example
Gender<-sample(c("Male","Female"),20,replace=TRUE)
Centering<-sample(c("Yes","No"),20,replace=TRUE)
Percentage<-sample(1:100,20)
df2<-data.frame(Gender,Centering,Percentage)
df2
Output
   Gender Centering Percentage
1    Male        No         28
2    Male        No         89
3  Female       Yes         38
4    Male        No         78
5    Male       Yes         19
6  Female        No         46
7  Female       Yes         94
8    Male        No          4
9    Male       Yes         92
10   Male        No         90
11   Male       Yes         66
12 Female        No         57
13 Female        No         74
14 Female        No         48
15 Female       Yes         20
16   Male       Yes         51
17   Male        No         82
18   Male        No          7
19   Male        No         53
20   Male        No         55
Finding the total of Percentage based on Gender and Centering:
with(df2,tapply(Percentage,list(Gender,Centering),sum))
        No Yes
Female 225 152
Male   486 228
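When the totals are needed as a data frame with one row per combination rather than as a matrix, aggregate from base R is one alternative. The sketch below is not part of the original example. It rebuilds df2 from the printed values so that it runs on its own, and the name long_totals is used only for illustration.

```r
# Rebuild df2 from the values printed in the output above
Gender <- c("Male", "Male", "Female", "Male", "Male", "Female", "Female",
            "Male", "Male", "Male", "Male", "Female", "Female", "Female",
            "Female", "Male", "Male", "Male", "Male", "Male")
Centering <- c("No", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No",
               "Yes", "No", "No", "No", "Yes", "Yes", "No", "No", "No", "No")
Percentage <- c(28L, 89L, 38L, 78L, 19L, 46L, 94L, 4L, 92L, 90L,
                66L, 57L, 74L, 48L, 20L, 51L, 82L, 7L, 53L, 55L)
df2 <- data.frame(Gender, Centering, Percentage)

# One row per Gender/Centering combination that occurs in the data,
# with the summed Percentage in the last column
long_totals <- aggregate(Percentage ~ Gender + Centering, data = df2, FUN = sum)
print(long_totals)
```

The long layout is convenient for merging or plotting, while the tapply matrix is easier to read at a glance.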