python - Renaming certain multiple values in a column of dataframe into another single value


I have a data frame, about 1 GB in size, that looks like the following dummy one:

df <- data.frame(group=rep(c("a", "b", "c","d","e","f","g","h"), each=4),height=sample(100:150, 16))
df
   group height
1      a    105
2      a    119
3      b    108
4      b    114
5      c    109
6      c    111
7      d    148
8      d    121
9      e    133
10     e    101
11     f    143
12     f    135
13     g    147
14     g    141
15     h    150
16     h    145

I am aiming to change the values in the group column: for example b, h, and g should become nc, a should become pc, and all others should become non. I tried the following one-liner:

de=c("b")
df =df$group[df$group %in% de,]<-"nc"

but it throws the following error:

Error in `[<-.factor`(`*tmp*`, df$group %in% de, , value = "nc") :
  incorrect number of subscripts on matrix
In addition: Warning message:
In `[<-.factor`(`*tmp*`, df$group %in% de, , value = "nc") :
  invalid factor level, NA generated

In the end, the data frame df should look like this:

df
   group height
1     pc    105
2     pc    119
3     nc    108
4     nc    114
5    non    109
6    non    111
7    non    148
8    non    121
9    non    133
10   non    101
11   non    143
12   non    135
13    nc    147
14    nc    141
15    nc    150
16    nc    145

Any suggestion in R or pandas would be great. Thank you.

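In R you can try:

The group column is a factor, so assigning a value that is not one of its existing levels produces NA. The extra comma inside the brackets also makes R treat the vector as a matrix, hence the subscript error. Convert the column to character first, then replace the value directly.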

df$group <- as.character(df$group);  df$group[df$group %in% c("b")] <- "nc" 
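Since the question also asks about pandas, here is a minimal sketch of the same single-value replacement. The small data frame is a hypothetical stand-in for the real one, and the variable name `de` is borrowed from the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's data frame (first six rows only).
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c", "c"],
    "height": [105, 119, 108, 114, 109, 111],
})

de = ["b"]  # values that should be renamed

# The boolean mask picks the matching rows; .loc writes into "group" only.
df.loc[df["group"].isin(de), "group"] = "nc"

print(df)
```

Plain string columns in pandas have no fixed set of levels, so no conversion step is needed here, unlike with an R factor.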

Edit:

For the updated question you can try ifelse. Of course, you can also overwrite the group column with this approach.

df$group2 <- ifelse( df$group %in% c("b", "h", "g"), "nc", ifelse(df$group %in% c("a"), "pc", "non"))
head(df, 10)
   group height group2
1      a    139     pc
2      a    114     pc
3      a    132     pc
4      a    141     pc
5      b    107     nc
6      b    101     nc
7      b    122     nc
8      b    129     nc
9      c    100    non
10     c    108    non
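A rough pandas counterpart of the nested ifelse, as a sketch rather than the only way to do it: map the known values through a dictionary and fill everything else with the default. The `mapping` dictionary and the `group2` column name are illustrative; the data is the 16-row dummy frame from the question.

```python
import pandas as pd

# The 16-row dummy data from the question.
df = pd.DataFrame({
    "group": list("aabbccddeeffgghh"),
    "height": [105, 119, 108, 114, 109, 111, 148, 121,
               133, 101, 143, 135, 147, 141, 150, 145],
})

# Illustrative lookup table: old value -> new value.
mapping = {"a": "pc", "b": "nc", "g": "nc", "h": "nc"}

# Values missing from the dictionary come back as NaN from map(),
# so fillna() supplies the "non" default for all remaining groups.
df["group2"] = df["group"].map(mapping).fillna("non")

print(df)
```

Assigning the result to `df["group"]` instead of `df["group2"]` overwrites the original column. For a frame of around 1 GB this stays vectorized, so no explicit loop over rows is involved.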
