R - Sum of all pairwise row products as a two-way matrix


I'm looking at high-throughput gene data and doing a type of correlation analysis based on Bayesian statistics. One of the things I need to do is find every pairwise combination of products in the dataset and then find the sum of each resultant row.

So, for example, take the high-throughput dataset matrix dataset:

    (dataset <- structure(list(`condition 1` = c(1L, 3L, 2L, 2L),
                               `condition 2` = c(2L, 1L, 7L, 2L),
                               `condition 3` = c(4L, 1L, 2L, 5L)),
                          .Names = c("condition 1", "condition 2", "condition 3"),
                          class = "data.frame",
                          row.names = c("gene a", "gene b", "gene c", "gene d")))

           condition 1 condition 2 condition 3
    gene a           1           2           4
    gene b           3           1           1
    gene c           2           7           2
    gene d           2           2           5

First I want to multiply every possible pair of rows to get the following matrix, called comb:

                  condition 1 condition 2 condition 3
    gene a gene a           1           4          16
    gene a gene b           3           2           4
    gene a gene c           2          14           8
    gene a gene d           2           4          20
    gene b gene b           9           1           1
    gene b gene c           6           7           2
    gene b gene d           6           2           5
    gene c gene c           4          49           4
    gene c gene d           4          14          10
    gene d gene d           4           4          25

After that I want to find the row sums for each product and get the sums in the form of a matrix (which I will call combsums):

           gene a gene b gene c gene d
    gene a     NA      9     24     26
    gene b      9     NA     15     13
    gene c     24     15     NA     28
    gene d     26     13     28     NA

When I tried to do it, the best I could come up with was

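    # every pair of distinct row indices, one pair per column
    combs <- combn(seq_len(nrow(dataset)), 2)
    # element-wise product of the two rows in each pair
    comb <- dataset[combs[1,], ] * dataset[combs[2,], ]
    # label each product row with the names of the two original rows
    rownames(comb) <- apply(combn(rownames(dataset), 2), 2, paste, collapse = " ")
    combsums <- rowSums(comb)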

which gives me the sums as a list, such as below:

                  [,1]
    gene a gene b    9
    gene a gene c   24
    gene a gene d   26
    gene b gene c   15
    gene b gene d   13
    gene c gene d   28

Unfortunately, I want it as a two-way matrix and not a list, so this doesn't quite work. If anyone could suggest a way to get the sums as a matrix, that would be a great help.
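One way to turn the list of pairwise sums into the two-way matrix is to reuse the index pairs from combn() as matrix subscripts. The following is only a minimal sketch in base R under that assumption; the names x and pair_sums are illustrative and not part of the original post.

```r
# Example data from the question
dataset <- data.frame(
  `condition 1` = c(1L, 3L, 2L, 2L),
  `condition 2` = c(2L, 1L, 7L, 2L),
  `condition 3` = c(4L, 1L, 2L, 5L),
  row.names = c("gene a", "gene b", "gene c", "gene d"),
  check.names = FALSE
)
x <- as.matrix(dataset)  # illustrative name, not from the original post

# Row-index pairs (i < j), one pair per column
combs <- combn(seq_len(nrow(x)), 2)

# Sum over conditions of the element-wise product of each pair of rows
pair_sums <- rowSums(x[combs[1, ], , drop = FALSE] * x[combs[2, ], , drop = FALSE])

# Start from an all-NA square matrix so the diagonal stays NA
n <- nrow(x)
combsums <- matrix(NA_real_, n, n, dimnames = list(rownames(x), rownames(x)))

# A two-column index matrix addresses one (row, column) cell per pair
combsums[t(combs)] <- pair_sums         # cells (i, j), upper triangle
combsums[t(combs[2:1, ])] <- pair_sums  # cells (j, i), lower triangle

combsums
```

The off-diagonal entries should agree with tcrossprod(as.matrix(dataset)), which computes the product of the matrix with its own transpose in a single call and may be simpler if the NA diagonal is set afterwards with diag().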

If speed is an important factor (e.g. if you're processing a huge matrix), you might find an Rcpp implementation helpful. This only fills the upper triangular portion of the matrix.

    library(Rcpp)
    cppFunction(
    "NumericMatrix josilberrcpp(NumericMatrix x) {
      const int nr = x.nrow();
      const int nc = x.ncol();
      NumericMatrix y(nr, nr);  // zero-initialised nr x nr result
      for (int col=0; col < nc; ++col) {
        for (int i=0; i < nr; ++i) {
          // j starts at i, so only the upper triangle (and diagonal) is filled
          for (int j=i; j < nr; ++j) {
            y(i, j) += x(i, col) * x(j, col);
          }
        }
      }
      return y;
    }")
    josilberrcpp(as.matrix(dataset))
    #      [,1] [,2] [,3] [,4]
    # [1,]   21    9   24   26
    # [2,]    0   11   15   13
    # [3,]    0    0   57   28
    # [4,]    0    0    0   33
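Because only the upper triangle is filled, a further step is needed to reach the symmetric layout with NA on the diagonal that the question asks for. The sketch below is one possible way to do that in base R. To stay runnable without a compiler it builds a stand-in for the Rcpp result with tcrossprod() and lower.tri(); the names upper and full are illustrative assumptions.

```r
# Example data from the question, as a numeric matrix (filled column by column)
x <- matrix(c(1, 3, 2, 2,
              2, 1, 7, 2,
              4, 1, 2, 5),
            nrow = 4,
            dimnames = list(c("gene a", "gene b", "gene c", "gene d"),
                            c("condition 1", "condition 2", "condition 3")))

# Stand-in for the upper-triangular result: full matrix of row-by-row
# product sums with the lower triangle zeroed and the dimnames dropped
upper <- tcrossprod(x)
upper[lower.tri(upper)] <- 0
dimnames(upper) <- NULL

# Mirror the upper triangle into the lower one; the diagonal is doubled
# by the addition, but it is overwritten with NA on the next line
full <- upper + t(upper)
diag(full) <- NA

# Restore the gene names on both axes
dimnames(full) <- list(rownames(x), rownames(x))

full
```

With the real function, upper would simply be the value returned by the Rcpp call on as.matrix(dataset).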

Benchmarking is provided in the other answer. Note that the benchmarking does not include the compile time for cppFunction, which can be quite significant. Therefore this implementation is mainly useful for large inputs or when you need to use the function many times.

