How to select the columns of an R data frame that are not in a vector?
Updated on 2025/6/27 7:07:17
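An R data frame can contain many columns, and we often want all of them except a few. When the columns to keep outnumber the columns to drop, it is easier to exclude the unwanted columns than to list every wanted one. This can be done by comparing names(df) against a vector of unwanted column names with %in%, negating the result with the ! operator, and passing it to single square brackets.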
Example
Consider the following data frame:
> Age <- sample(20:50, 20)
> Gender <- rep(c("Male", "Female"), times = 10)
> Salary <- sample(20000:40000, 20)
> ID <- 1:20
> Education <- rep(c("Grad", "Post-Grad", "PhD", "Highschool"), times = 5)
> Experience <- sample(1:5, 20, replace = TRUE)
> df <- data.frame(ID, Gender, Age, Salary, Experience, Education)
> df
Output
   ID Gender Age Salary Experience  Education
1   1   Male  35  38245          4       Grad
2   2 Female  43  21995          5  Post-Grad
3   3   Male  21  38941          4        PhD
4   4 Female  26  36599          2 Highschool
5   5   Male  50  27477          2       Grad
6   6 Female  28  25281          2  Post-Grad
7   7   Male  33  20310          4        PhD
8   8 Female  24  30171          2 Highschool
9   9   Male  38  28779          3       Grad
10 10 Female  46  31213          3  Post-Grad
11 11   Male  36  27697          4        PhD
12 12 Female  41  36929          2 Highschool
13 13   Male  42  35367          2       Grad
14 14 Female  29  28711          1  Post-Grad
15 15   Male  22  29253          3        PhD
16 16 Female  30  28982          5 Highschool
17 17   Male  39  39458          4       Grad
18 18 Female  27  31891          2  Post-Grad
19 19   Male  48  29931          2        PhD
20 20 Female  31  34817          2 Highschool
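The values above come from sample(), so they change on every run. As a minimal sketch, assuming reproducible output is wanted, calling set.seed() before the first sample() call fixes the draws. The seed value 123 is an arbitrary assumption and not part of the original example, so the numbers it produces will differ from the output shown above.

```r
# Fix the random seed so that sample() returns the same values on every run.
# The seed value 123 is an arbitrary choice made for this sketch.
set.seed(123)

Age <- sample(20:50, 20)
Gender <- rep(c("Male", "Female"), times = 10)
Salary <- sample(20000:40000, 20)
ID <- 1:20
Education <- rep(c("Grad", "Post-Grad", "PhD", "Highschool"), times = 5)
Experience <- sample(1:5, 20, replace = TRUE)

df <- data.frame(ID, Gender, Age, Salary, Experience, Education)

# Show the structure of the data frame: 20 rows and 6 columns.
str(df)
```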
Selecting all columns except Gender and Age:
> df[, !names(df) %in% c("Gender", "Age")]
Output
   ID Salary Experience  Education
1   1  38245          4       Grad
2   2  21995          5  Post-Grad
3   3  38941          4        PhD
4   4  36599          2 Highschool
5   5  27477          2       Grad
6   6  25281          2  Post-Grad
7   7  20310          4        PhD
8   8  30171          2 Highschool
9   9  28779          3       Grad
10 10  31213          3  Post-Grad
11 11  27697          4        PhD
12 12  36929          2 Highschool
13 13  35367          2       Grad
14 14  28711          1  Post-Grad
15 15  29253          3        PhD
16 16  28982          5 Highschool
17 17  39458          4       Grad
18 18  31891          2  Post-Grad
19 19  29931          2        PhD
20 20  34817          2 Highschool
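To see why this expression works, it can help to evaluate it in steps. The sketch below uses a small three-row data frame, demo, built only for illustration; the names demo, in_vector, and keep are hypothetical and do not appear in the original example. Because %in% binds more tightly than !, the expression !names(df) %in% x is read as !(names(df) %in% x).

```r
# Small data frame used only for this illustration.
demo <- data.frame(ID = 1:3,
                   Gender = c("Male", "Female", "Male"),
                   Age = c(35, 43, 21),
                   Salary = c(38245, 21995, 38941))

# Step 1: TRUE for every column name that appears in the vector.
in_vector <- names(demo) %in% c("Gender", "Age")
in_vector
# FALSE TRUE TRUE FALSE

# Step 2: ! flips the mask, so TRUE now marks the columns to keep.
keep <- !in_vector
keep
# TRUE FALSE FALSE TRUE

# Step 3: a logical vector in the column position keeps the TRUE columns,
# here ID and Salary.
demo[, keep]
```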
Selecting all columns except ID and Education:
> df[, !names(df) %in% c("ID", "Education")]
Output
   Gender Age Salary Experience
1    Male  35  38245          4
2  Female  43  21995          5
3    Male  21  38941          4
4  Female  26  36599          2
5    Male  50  27477          2
6  Female  28  25281          2
7    Male  33  20310          4
8  Female  24  30171          2
9    Male  38  28779          3
10 Female  46  31213          3
11   Male  36  27697          4
12 Female  41  36929          2
13   Male  42  35367          2
14 Female  29  28711          1
15   Male  22  29253          3
16 Female  30  28982          5
17   Male  39  39458          4
18 Female  27  31891          2
19   Male  48  29931          2
20 Female  31  34817          2
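The vector of unwanted names does not have to be written inline; it can be stored in a variable first. The sketch below also shows one possible alternative, base R's setdiff(), which computes the names to keep directly. The data frame demo and the variable names drop_cols, by_mask, and by_setdiff are hypothetical and used only for illustration.

```r
# Small data frame used only for this illustration.
demo <- data.frame(ID = 1:3,
                   Gender = c("Male", "Female", "Male"),
                   Age = c(35, 43, 21),
                   Salary = c(38245, 21995, 38941))

# Hypothetical name for the vector of columns to exclude.
drop_cols <- c("ID", "Salary")

# Same logical-mask approach as above, with the vector held in a variable.
by_mask <- demo[, !names(demo) %in% drop_cols]

# Alternative: compute the names to keep with setdiff(), then select by name.
by_setdiff <- demo[, setdiff(names(demo), drop_cols)]

# Both approaches select the same columns: Gender and Age.
names(by_mask)
names(by_setdiff)
```

The mask form keeps the original column order and leaves the choice of columns as a logical vector that can be combined with other conditions; the setdiff() form may read more directly when only names are involved.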
Selecting all columns except ID, Education, and Gender:
> df[, !names(df) %in% c("ID", "Education", "Gender")]
Output
   Age Salary Experience
1   35  38245          4
2   43  21995          5
3   21  38941          4
4   26  36599          2
5   50  27477          2
6   28  25281          2
7   33  20310          4
8   24  30171          2
9   38  28779          3
10  46  31213          3
11  36  27697          4
12  41  36929          2
13  42  35367          2
14  29  28711          1
15  22  29253          3
16  30  28982          5
17  39  39458          4
18  27  31891          2
19  48  29931          2
20  31  34817          2
Selecting all columns except ID, Age, and Gender:
> df[, !names(df) %in% c("ID", "Age", "Gender")]
Output
   Salary Experience  Education
1   38245          4       Grad
2   21995          5  Post-Grad
3   38941          4        PhD
4   36599          2 Highschool
5   27477          2       Grad
6   25281          2  Post-Grad
7   20310          4        PhD
8   30171          2 Highschool
9   28779          3       Grad
10  31213          3  Post-Grad
11  27697          4        PhD
12  36929          2 Highschool
13  35367          2       Grad
14  28711          1  Post-Grad
15  29253          3        PhD
16  28982          5 Highschool
17  39458          4       Grad
18  31891          2  Post-Grad
19  29931          2        PhD
20  34817          2 Highschool
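One case the examples above do not cover is an exclusion that leaves a single column. With the two-index form, R then simplifies the result to a plain vector rather than a data frame. The sketch below shows this with a small illustrative data frame and two ways to keep the data frame structure; the names demo, one_col, one_col_df, and one_col_list are hypothetical.

```r
# Small data frame used only for this illustration.
demo <- data.frame(ID = 1:3,
                   Gender = c("Male", "Female", "Male"),
                   Age = c(35, 43, 21),
                   Salary = c(38245, 21995, 38941))

# Excluding three of the four columns leaves only Salary,
# and the two-index form simplifies it to a numeric vector.
one_col <- demo[, !names(demo) %in% c("ID", "Gender", "Age")]
is.data.frame(one_col)       # FALSE

# drop = FALSE keeps the data frame structure.
one_col_df <- demo[, !names(demo) %in% c("ID", "Gender", "Age"), drop = FALSE]
is.data.frame(one_col_df)    # TRUE

# Single-index (list-style) subsetting also returns a data frame.
one_col_list <- demo[!names(demo) %in% c("ID", "Gender", "Age")]
is.data.frame(one_col_list)  # TRUE
```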