I have a data frame called conv2 (I show the tail of the df):
8464 208394_x_at esm1     -1.035878e-01
8468 200858_s_at snord55  -1.034971e-01
8469 200858_s_at snord38b -1.034971e-01
8467 200858_s_at rps8     -1.034971e-01
8472 207381_at   rps8     -1.034510e-01
8477 211197_s_at icoslg   -1.033752e-01
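What I want is, whenever a name is repeated in the second column, such as rps8, to remove the rows containing that name except the one with the highest absolute value in the third column. In the example, row 8467 has the larger absolute value of the two rps8 rows, so row 8472 would be removed.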
I have done it this way:
for (d in dup) {
  rows <- conv[which(conv$symbol == d), ]
  conv2 <- rbind(conv2, rows[which.max(abs(rows[, 3])), ])
}
Is there a better, faster way of doing this?
Here is a base R solution that uses the "split-apply-combine" methodology.
# split the data.frame by column 2
mylist <- split(conv2, conv2$col2)
# loop through the list of data.frames and rbind the observations with the maximum absolute values
dfnew <- do.call(rbind, lapply(mylist, function(i) i[which.max(abs(i$col3)), ]))
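As a possible alternative to splitting, here is a minimal base R sketch that sorts the rows by decreasing absolute value and then drops every later occurrence of a name with duplicated(). It avoids building one small data frame per group, so it may be faster when there are many distinct names, though I have not benchmarked it. The column names probe, symbol and value are assumptions for illustration (the question does not show the real headers); the rows are the ones from the question.

```r
# Sample rows copied from the question; the column names are assumed.
conv2 <- data.frame(
  probe  = c("208394_x_at", "200858_s_at", "200858_s_at",
             "200858_s_at", "207381_at", "211197_s_at"),
  symbol = c("esm1", "snord55", "snord38b", "rps8", "rps8", "icoslg"),
  value  = c(-1.035878e-01, -1.034971e-01, -1.034971e-01,
             -1.034971e-01, -1.034510e-01, -1.033752e-01),
  row.names = c("8464", "8468", "8469", "8467", "8472", "8477"),
  stringsAsFactors = FALSE
)

# Sort by descending absolute value, so the first row seen for each
# symbol is the one with the largest absolute value.
ord <- conv2[order(-abs(conv2$value)), ]

# duplicated() flags every occurrence after the first; drop those rows.
dedup <- ord[!duplicated(ord$symbol), ]
print(dedup)
```

With the sample rows, the two rps8 entries collapse to the single row 8467, and the other names are left untouched. Note that the result is ordered by absolute value rather than by the original row order.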