R: Ordering one column conditionally on another and partial order value
I have a data frame of retweets:
    set.seed(28100)
    df <- data.frame(user_id = sample(1:8, 10, replace = TRUE),
                     timestamp = sample(1:1000, 10),
                     retweet = sample(999:1002, 10, replace = TRUE))
    # sort by retweet id, then by timestamp in decreasing order
    df <- df[with(df, order(retweet, -timestamp)), ]
    df
    #    user_id timestamp retweet
    # 6        8       513     999
    # 9        7       339     999
    # 3        3       977    1000
    # 2        3       395    1000
    # 5        2       333    1000
    # 4        5       793    1001
    # 1        3       873    1002
    # 8        2       638    1002
    # 7        4       223    1002
    # 10       6        72    1002
There is a unique id for each retweet (the retweet column). For each row I want to assign a rank to the user according to the inverse order of the chain of retweets. The rank should estimate the influence of each user: the longer the chain, the higher the points for the twitterer. In other words, I want to rank-order each retweet chain based on timestamp, and assign higher points to whoever retweeted before. If two users have posted the same retweet at the same time, they should be assigned the same ranking.
Or, in df:
    df$ranking <- c(1,2, 1,2,3, 1, 1,2,3,4)
    aggregate(ranking ~ user_id, data = df, sum)
    #   user_id ranking
    # 1       2       5
    # 2       3       4
    # 3       4       3
    # 4       5       1
    # 5       6       4
    # 6       7       2
    # 7       8       1
Using data.table:
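    library(data.table)
    # within each retweet, number the rows from the latest timestamp to the earliest
    setDT(df)[order(-timestamp), ranking2 := seq_len(.N), by = retweet]
    # sum the points per user
    df[, sum(ranking2), keyby = user_id]
    #    user_id V1
    # 1:       2  5
    # 2:       3  4
    # 3:       4  3
    # 4:       5  1
    # 5:       6  4
    # 6:       7  2
    # 7:       8  1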
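Note that seq_len() numbers the rows one after another, so two retweets with the same timestamp in the same chain would still get different ranks. The question asks for tied timestamps to share a rank, so here is a minimal base R sketch of a tie-aware variant. The helper dense_rank_desc and the column ranking3 are names made up for this illustration, and the data frame is typed in from the printed output in the question rather than regenerated with set.seed().

```r
# The ten rows printed in the question, entered by hand so the example
# does not depend on sample() producing the same draws.
df <- data.frame(
  user_id   = c(8, 7, 3, 3, 2, 5, 3, 2, 4, 6),
  timestamp = c(513, 339, 977, 395, 333, 793, 873, 638, 223, 72),
  retweet   = c(999, 999, 1000, 1000, 1000, 1001, 1002, 1002, 1002, 1002)
)

# Hypothetical helper: dense rank of x in decreasing order.
# The largest value gets 1, equal values share a rank, and no rank
# is skipped after a tie.
dense_rank_desc <- function(x) match(-x, sort(unique(-x)))

# Apply the helper separately within each retweet chain.
df$ranking3 <- ave(df$timestamp, df$retweet, FUN = dense_rank_desc)

# Sum the points per user, as in the question.
totals <- aggregate(ranking3 ~ user_id, data = df, sum)
print(totals)

# Made-up chain with a tie: the two posts at time 50 share a rank.
print(dense_rank_desc(c(90, 50, 50, 10)))
```

The sample data has no tied timestamps, so ranking3 should match the hand-written ranking column here. In data.table, frank(-timestamp, ties.method = "dense") grouped by retweet should be the equivalent, though that is not shown in the answer above.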