BPCells icon indicating copy to clipboard operation
BPCells copied to clipboard

`svds` behaves differently when `storage_order` is "row"

Open ycli1995 opened this issue 2 years ago • 1 comments

Hi, @bnprks. I followed the BPCells tutorials but computed SVD using svds() instead of irlba(). For example, in the RNA part:

svd <- svds(mat_norm, k=50)
pca <- multiply_cols(svd$v, svd$d)
cat(sprintf("PCA dimensions: %s\n", toString(dim(pca))))

I got the same results as irlba:

PCA dimensions: 2768, 50

However, when it came to ATAC's svd_atac, I got this:

svd_atac <- svds(mat_lsi_norm, 10)
pca_atac <- multiply_cols(svd_atac$v, svd_atac$d)
cat(sprintf("PCA dimensions: %s\n", toString(dim(pca_atac))))
PCA dimensions: 211807, 10

In other word, it seemed that the svd_atac$v and svd_atac$u were actually swapped, compared to RNA's svd$v and svd$u. When I check the input matrix, I found storage_order(mat_lsi_norm) was "row" while storage_order(mat_norm) was "col" as most scRNA-seq matrix in R would be.

Do you think BPCells::svds() should give u and v consistently regardless of storage_order, or should we just leave the users to check storage_order after calling it?

ycli1995 avatar Nov 10 '23 08:11 ycli1995

Thanks for pointing this out -- definitely a bug in BPCells and not intended behavior. This should be fixed with the above commit.

The underlying cause here is that the C++ code treats everything as a column-major matrix, so row-major matrices need some special treatment to ensure conversion works properly

bnprks avatar Nov 15 '23 04:11 bnprks