bigmemory
convert a sparse matrix to a big.matrix.
I am using biglasso, and my data is a sparse matrix from the Matrix package. biglasso seems to support only big.matrix input, and I cannot convert a sparse matrix to a big.matrix. Any suggestions? Thanks.
You can either do bigmemory::as.big.matrix(as.matrix(x)), which is a quick and dirty solution, or use:
Rcpp code:
// [[Rcpp::depends(bigmemory, BH)]]
#include <bigmemory/MatrixAccessor.hpp>
#include <Rcpp.h>
using namespace Rcpp;

// Copy the non-zero entries of a column-compressed sparse matrix
// (e.g. a dgCMatrix) into a double big.matrix already filled with 0.
// [[Rcpp::export]]
void sp2BM(const S4& source,
           XPtr<BigMatrix> dest) {
  MatrixAccessor<double> macc(*dest);
  int ncol = macc.ncol();
  NumericVector x = source.slot("x");  // non-zero values
  IntegerVector i = source.slot("i");  // 0-based row indices
  IntegerVector p = source.slot("p");  // column pointers, length ncol + 1
  for (int j = 0; j < ncol; j++) {
    // Entries p[j], ..., p[j + 1] - 1 belong to column j.
    for (int k = p[j]; k < p[j + 1]; k++) {
      macc[j][i[k]] = x[k];
    }
  }
}
R code:
spToBM <- function(x, ...) {
  res <- bigmemory::big.matrix(x@Dim[1], x@Dim[2], init = 0, ...)
  sp2BM(x, res@address)
  res
}
Verification:
library(Matrix)
x <- Matrix(0, 100, 100)
x[sample(length(x), 400)] <- 1
test <- spToBM(x)
all.equal(test[], as.matrix(x), check.attributes = FALSE)
This may need further testing.
Interesting, will this feature be implemented soon?
I'm happy to take a pull request for this one. Otherwise, I think I can get to it in the next few weeks.
Any update on this? The feature would be especially useful in cases where the object is too big to run as.matrix() on.
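For objects that are too big for a single as.matrix() call, one pure-R workaround might be to densify only a block of columns at a time and write each block into the big.matrix. The following is a minimal sketch of that idea, not a bigmemory feature: sp_to_bm_chunked and its block_size argument are hypothetical names, and it assumes the sparse matrix has at least one column and that the destination is of type "double".

```r
library(Matrix)
library(bigmemory)

# Hypothetical helper (not part of bigmemory): copy a sparse matrix into a
# big.matrix one block of columns at a time. Assumes ncol(x) >= 1.
sp_to_bm_chunked <- function(x, block_size = 1000, ...) {
  res <- big.matrix(nrow(x), ncol(x), type = "double", init = 0, ...)
  for (start in seq(1, ncol(x), by = block_size)) {
    cols <- start:min(start + block_size - 1, ncol(x))
    # Only this block of columns is held in memory as a dense matrix.
    res[, cols] <- as.matrix(x[, cols, drop = FALSE])
  }
  res
}

# Small check against the one-shot dense conversion.
x <- rsparsematrix(200, 50, density = 0.05)
bm <- sp_to_bm_chunked(x, block_size = 7)
all.equal(bm[], as.matrix(x), check.attributes = FALSE)
```

Arguments such as backingfile can be passed through ... to big.matrix() so that the dense result lives on disk rather than in RAM. This is slower than the Rcpp version above, but peak memory is bounded by roughly nrow(x) * block_size doubles.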
I'm a bit sceptical about the usefulness of converting a sparse matrix to a dense big.matrix. Do you have a use case for that?
Maybe it would be better to have a sparse.big.matrix. I tried to do some tests in https://github.com/privefl/spBigMatrix but I don't really use sparse matrices. And it would be a lot of work to reimplement algorithms for this new type of data.
On Linux and Windows you can create big matrices backed by sparse files by leaving the init argument as NULL. The size of the backing file will depend on the number of pages needed to represent the matrix. Note that, in Linux, you need to verify this with du rather than ls, since ls tells you the "virtual" size.
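As a rough illustration of the sparse-file approach, the sketch below creates a file-backed big.matrix with init = NULL, writes a single entry, and then compares the apparent file size with the space actually allocated. The file names are placeholders chosen for this example, the du call assumes a Linux system, and whether the file ends up sparse depends on the filesystem.

```r
library(bigmemory)

dir <- tempdir()

# init = NULL: the backing file is created but not filled, so on a
# filesystem with sparse-file support its pages are allocated lazily.
bm <- filebacked.big.matrix(
  nrow = 1e4, ncol = 1e3, type = "double", init = NULL,
  backingfile = "sparse_demo.bin",       # placeholder name
  backingpath = dir,
  descriptorfile = "sparse_demo.desc"    # placeholder name
)
bm[1, 1] <- 1  # touch a single entry

path <- file.path(dir, "sparse_demo.bin")
file.size(path)  # apparent ("virtual") size, the number ls reports

# Space actually allocated on disk, in KiB (Linux only).
if (Sys.info()[["sysname"]] == "Linux") {
  system2("du", c("-k", shQuote(path)))
}
```

If the file is sparse, the du figure should be much smaller than the apparent size divided by 1024.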
Also note that, if you are running numerically intensive routines, a sparse representation is more performant when the matrix is 99% sparse or more. When the matrix is less than about 90% sparse, a dense representation is faster.
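To see where a given matrix falls relative to these rules of thumb, its sparsity and the memory a dense copy would need can be estimated before converting. This is only a sketch: sparsity is a hypothetical helper defined here, and the 90% and 99% figures are the approximate thresholds quoted above, not hard limits.

```r
library(Matrix)

# Hypothetical helper: fraction of entries that are zero.
sparsity <- function(x) {
  1 - nnzero(x) / prod(dim(x))
}

x <- rsparsematrix(1000, 1000, density = 0.005)
s <- sparsity(x)
s

# Bytes a dense double matrix (or big.matrix) of the same shape would need.
dense_bytes <- 8 * prod(dim(x))
dense_bytes

# Rough guide based on the thresholds mentioned in this thread.
if (s >= 0.99) {
  message("sparse representation is likely faster")
} else if (s < 0.90) {
  message("dense representation is likely faster")
} else {
  message("in between: worth benchmarking both")
}
```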