i have table this(input):
user_id event timestamp rob business 111111 rob business 222222 mike progress 111111 mike progress 222222 rob progress 000001 mike business 333333 mike progress 444444 lee progress 111111 lee progress 222222
dput
of table:
dput(input) structure(list(user_id = structure(c(3l, 3l, 2l, 2l, 3l, 2l, 2l, 1l, 1l), .label = c("lee", "mike", "rob"), class = "factor"), event = structure(c(1l, 1l, 2l, 2l, 2l, 1l, 2l, 2l, 2l), .label = c("business", "progress"), class = "factor"), timestamp = c(111111, 222222, 111111, 222222, 1, 333333, 444444, 111111, 222222)), .names = c("user_id", "event", "timestamp"), row.names = c(na, -9l), class = "data.frame")
i want know last progress
event before first business
event happens (output):
user_id event timestamp mike progress 222222 rob progress 000001
thanks help!!!
we can try data.table
library(data.table) setdt(df1)[df1[order(as.numeric(timestamp)), if(any(event == "business")) .i[tail(which(cumsum(event == "business")==0),1)], user_id]$v1] # user_id event timestamp #1: rob progress 000001 #2: mike progress 222222
Comments
Post a Comment