R - Calculate mean for multiple columns in a data.frame


Just wondering whether it is possible to calculate the means of multiple columns using the mean function.

e.g.

mean(iris[,1]) 

is possible, but not

mean(iris[,1:4]) 

I tried:

mean(iris[,c(1:4)]) 

and got this warning message:

Warning message:
In mean.default(iris[, 1:4]) :
  argument is not numeric or logical: returning NA

I know I can use lapply(iris[,1:4], mean) or sapply(iris[,1:4], mean).
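For what it's worth, the two calls differ only in the shape of what they return. A minimal sketch on the built-in iris data (the variable names means_list and means_vec are just illustrative):

```r
# lapply() returns a list with one element per column
means_list <- lapply(iris[, 1:4], mean)

# sapply() simplifies the same result to a named numeric vector
means_vec <- sapply(iris[, 1:4], mean)

str(means_list)
print(means_vec)
```

Either way, mean() is applied to one column at a time, which avoids the warning above because each column is a plain numeric vector.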

Try colMeans:

But the columns must be numeric. You can add a test for that, which helps with larger datasets.

colMeans(iris[sapply(iris, is.numeric)])
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
    5.843333     3.057333     3.758000     1.199333
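To illustrate why the numeric test matters, here is a small sketch: colMeans() stops with an error when a non-numeric column such as Species is included, while filtering first works. The helper name numeric_col_means is hypothetical, not part of any package.

```r
# Hypothetical helper: column means of only the numeric columns of a data frame
numeric_col_means <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))  # one TRUE/FALSE per column
  colMeans(df[is_num])
}

# Without the test, the factor column Species makes colMeans() fail;
# tryCatch() turns that error into the string "error" for display
res <- tryCatch(colMeans(iris), error = function(e) "error")

print(res)
print(numeric_col_means(iris))
```

vapply() is used here instead of sapply() only because it guarantees a logical vector even for a data frame with zero columns; for iris the two behave the same.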

Benchmark

dplyr and data.table seem to take a long time here. Perhaps someone can replicate the findings to check their veracity.

library(microbenchmark)
library(dplyr)
library(data.table)

microbenchmark(
  plafort = colMeans(big.df[sapply(big.df, is.numeric)]),
  carlos  = colMeans(Filter(is.numeric, big.df)),
  cdtable = big.dt[, lapply(.SD, mean)],
  cdplyr  = big.df %>% summarise_each(funs(mean))
)
#Unit: milliseconds
#    expr       min        lq     mean    median       uq       max
# plafort  9.862934 10.506778 12.07027 10.699616 11.16404  31.23927
#  carlos  9.215143  9.557987 11.30063  9.843197 10.21821  65.21379
# cdtable 57.157250 64.866996 78.72452 67.633433 87.52451 264.60453
#  cdplyr 62.933293 67.853312 81.77382 71.296555 91.44994 182.36578

Data

m <- matrix(1:1e6, 1000)
m2 <- matrix(rep('a', 1000), ncol=1)
big.df <- as.data.frame(cbind(m2, m), stringsAsFactors=FALSE)
big.df[,-1] <- lapply(big.df[,-1], as.numeric)
big.dt <- as.data.table(big.df)
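Before comparing timings, it may be worth checking that the two base R approaches return the same values. A rough sketch on a smaller version of the data above (small.df, res_sapply and res_filter are illustrative names; only base R is assumed):

```r
# Smaller version of the test data: one character column plus 100 numeric columns
m <- matrix(1:1e4, 100)
m2 <- matrix(rep("a", 100), ncol = 1)
small.df <- as.data.frame(cbind(m2, m), stringsAsFactors = FALSE)
small.df[, -1] <- lapply(small.df[, -1], as.numeric)

# The two base R approaches from the benchmark
res_sapply <- colMeans(small.df[sapply(small.df, is.numeric)])
res_filter <- colMeans(Filter(is.numeric, small.df))

# Both should select the same columns and give the same means
print(all.equal(res_sapply, res_filter))
```

Both variants drop the character column before calling colMeans(), so any timing difference between them comes from how the columns are selected, not from the mean calculation itself.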
