I have a data frame, about 1 GB in size; the following is a dummy one:
df <- data.frame(group = rep(c("a", "b", "c", "d", "e", "f", "g", "h"), each = 4),
                 height = sample(100:150, 16))
df
   group height
1      a    105
2      a    119
3      b    108
4      b    114
5      c    109
6      c    111
7      d    148
8      d    121
9      e    133
10     e    101
11     f    143
12     f    135
13     g    147
14     g    141
15     h    150
16     h    145
I am aiming to change the names in the column group: for example, b, h, and g should become nc, a should become pc, and all others should become non. I tried the following one-liner:
de = c("b")
df = df$group[df$group %in% de,] <- "nc"
But it's throwing the following error:
Error in `[<-.factor`(`*tmp*`, df$group %in% de, , value = "nc") :
  incorrect number of subscripts on matrix
In addition: Warning message:
In `[<-.factor`(`*tmp*`, df$group %in% de, , value = "nc") :
  invalid factor level, NA generated
In the end, the data frame df should look like this:
df
   group height
1     pc    105
2     pc    119
3     nc    108
4     nc    114
5    non    109
6    non    111
7    non    148
8    non    121
9    non    133
10   non    101
11   non    143
12   non    135
13    nc    147
14    nc    141
15    nc    150
16    nc    145
Any suggestion in R or pandas would be great. Thank you.
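In R you can try:
Transform the column to character first, then replace the value directly. The group column is a factor, and "nc" is not one of its existing levels, which is what the "invalid factor level" warning refers to.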
df$group <- as.character(df$group)
df$group[df$group %in% c("b")] <- "nc"
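To make the conversion step concrete, here is a minimal self-contained sketch. It assumes a two-rows-per-group dummy frame (matching the printed output in the question, not the `each=4` call) and sets `stringsAsFactors = TRUE` explicitly, because from R 4.0.0 onward `data.frame()` no longer creates factors by default and the error would not reproduce otherwise. The `set.seed(1)` call is only there to make the sketch repeatable.

```r
# Dummy data; stringsAsFactors = TRUE reproduces the factor column from the question
set.seed(1)
df <- data.frame(group = rep(c("a", "b", "c", "d", "e", "f", "g", "h"), each = 2),
                 height = sample(100:150, 16),
                 stringsAsFactors = TRUE)

# Convert the factor to character so that new values such as "nc" are allowed
df$group <- as.character(df$group)

# Index the vector with a single logical subscript: no trailing comma,
# because df$group is one-dimensional (the comma caused the
# "incorrect number of subscripts" error in the question)
df$group[df$group %in% c("b")] <- "nc"
```

After this, `table(df$group)` should show two rows labelled nc and the remaining groups unchanged.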
Edit:
As you updated the question, you can try ifelse. Of course, you can also overwrite the group column with this approach.
df$group2 <- ifelse(df$group %in% c("b", "h", "g"), "nc",
                    ifelse(df$group %in% c("a"), "pc", "non"))
head(df, 10)
   group height group2
1      a    139     pc
2      a    114     pc
3      a    132     pc
4      a    141     pc
5      b    107     nc
6      b    101     nc
7      b    122     nc
8      b    129     nc
9      c    100    non
10     c    108    non
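If the number of categories grows, nested ifelse calls can become hard to read. As an alternative sketch (not the approach used in the answer above), the same recoding can be written with a named lookup vector. The name `lookup` and the hard-coded heights, copied from the question's printed output, are illustrative assumptions.

```r
# Dummy data using the heights printed in the question (two rows per group)
df <- data.frame(group = rep(c("a", "b", "c", "d", "e", "f", "g", "h"), each = 2),
                 height = c(105, 119, 108, 114, 109, 111, 148, 121,
                            133, 101, 143, 135, 147, 141, 150, 145),
                 stringsAsFactors = FALSE)

# Hypothetical lookup table: old label -> new label
lookup <- c(a = "pc", b = "nc", g = "nc", h = "nc")

# Index the lookup by the old labels; labels missing from the lookup come back as NA
new_group <- unname(lookup[as.character(df$group)])

# Everything that was not listed becomes "non"
new_group[is.na(new_group)] <- "non"

# Overwrite the original column
df$group <- new_group
```

The `as.character()` call keeps the lookup working even if group is a factor, since indexing by a factor would use its integer codes rather than its labels.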