R: Ordering one column conditionally on another and partial order value


I have a data frame of retweets:

    set.seed(28100)
    df <- data.frame(user_id = sample(1:8, 10, replace = TRUE),
                     timestamp = sample(1:1000, 10),
                     retweet = sample(999:1002, 10, replace = TRUE))
    # sort by retweet id, then by timestamp descending within each retweet
    df <- df[with(df, order(retweet, -timestamp)), ]
    df
    #    user_id timestamp retweet
    # 6        8       513     999
    # 9        7       339     999
    # 3        3       977    1000
    # 2        3       395    1000
    # 5        2       333    1000
    # 4        5       793    1001
    # 1        3       873    1002
    # 8        2       638    1002
    # 7        4       223    1002
    # 10       6        72    1002

There is a unique id for each retweet. For each row I want to assign a rank to the user according to the inverse order of the chain of retweets. The rank should estimate the influence of each user: the longer the chain, the higher the points for the first twitterer. In other words, I want to rank-order each retweet chain based on timestamp, and assign higher points to whoever retweeted earlier. If two users have posted the same retweet at the same time, they should be assigned the same ranking.

Or, expressed on df:

    df$ranking <- c(1,2, 1,2,3, 1, 1,2,3,4)
    aggregate(ranking ~ user_id, data = df, sum)
    #   user_id ranking
    # 1       2       5
    # 2       3       4
    # 3       4       3
    # 4       5       1
    # 5       6       4
    # 6       7       2
    # 7       8       1
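The hand-typed ranking above can also be computed. The following is a minimal base R sketch, not necessarily what the original poster had in mind: it rebuilds the data frame from the printed rows (so it does not depend on `set.seed` reproducing the same sample) and ranks the negated timestamp within each retweet using `ave` and `rank`. The choice of `ties.method = "min"` is an assumption about how equal timestamps should be treated; it gives tied rows the same rank.

```r
# Data copied from the printed output above (already sorted by retweet, -timestamp)
df <- data.frame(
  user_id   = c(8, 7, 3, 3, 2, 5, 3, 2, 4, 6),
  timestamp = c(513, 339, 977, 395, 333, 793, 873, 638, 223, 72),
  retweet   = c(999, 999, 1000, 1000, 1000, 1001, 1002, 1002, 1002, 1002)
)

# Within each retweet, the latest timestamp gets rank 1 and the earliest
# gets the largest rank; equal timestamps share a rank (assumption: "min" ties)
df$ranking <- ave(-df$timestamp, df$retweet,
                  FUN = function(x) rank(x, ties.method = "min"))

# Sum the per-retweet ranks into one score per user
res <- aggregate(ranking ~ user_id, data = df, sum)
res
```

On this data there are no tied timestamps, so the result should agree with the manually entered `ranking` column.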

Using data.table:

    library(data.table)
    # number the rows of each retweet from latest to earliest timestamp
    setDT(df)[order(-timestamp), ranking2 := seq_len(.N), by = retweet]
    # total points per user
    df[, sum(ranking2), keyby = user_id]
    #    user_id V1
    # 1:       2  5
    # 2:       3  4
    # 3:       4  3
    # 4:       5  1
    # 5:       6  4
    # 6:       7  2
    # 7:       8  1
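Note that `seq_len(.N)` gives distinct ranks even when two rows of the same retweet share a timestamp, whereas the question asks for equal ranks in that case. A possible tie-aware variant, sketched here with `data.table::frank` and `ties.method = "dense"`, is shown below. The extra row for `user_id = 1` is hypothetical and added only to create a tie; the column names `ranking3` and `points` are likewise made up for illustration.

```r
library(data.table)

# Rows from the printed output, plus one hypothetical row (user_id = 1)
# that retweets 999 at the same timestamp as user 7
dt <- data.table(
  user_id   = c(8, 7, 1, 3, 3, 2, 5, 3, 2, 4, 6),
  timestamp = c(513, 339, 339, 977, 395, 333, 793, 873, 638, 223, 72),
  retweet   = c(999, 999, 999, 1000, 1000, 1000, 1001, 1002, 1002, 1002, 1002)
)

# Dense rank of the negated timestamp within each retweet:
# latest = 1, earlier retweets get higher ranks, ties share a rank
dt[, ranking3 := frank(-timestamp, ties.method = "dense"), by = retweet]

# Total points per user
points <- dt[, .(points = sum(ranking3)), keyby = user_id]
points
```

With `"dense"` the ranks after a tie continue without a gap; `"min"` would skip a value instead. Which behaviour is appropriate depends on how the influence score is meant to be read.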
