
setFromTriplets function gets segfault error

Open yanzhaobiomath opened this issue 1 year ago • 5 comments

Hello, I'm using the setFromTriplets function to build a large sparse matrix from a list of triplets. I initialized the sparse matrix with the right dimensions and reserved memory for it, but I still get a segfault with the message 'memory not mapped'.

The matrix is 80000*80000 with 50% non-zero elements. Do you have any hint about the reason?
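As a quick sanity check on those numbers, the expected entry count can be compared against the 32-bit signed integer range. This is only a sketch: the helper names `expected_nonzeros` and `fits_in_int32` are made up for illustration and are not part of Eigen or RcppEigen.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: number of stored entries for an n-by-n matrix with the
// given fill fraction, computed in 64-bit so the product itself cannot wrap.
constexpr std::int64_t expected_nonzeros(std::int64_t n, double fill) {
    return static_cast<std::int64_t>(static_cast<double>(n) *
                                     static_cast<double>(n) * fill);
}

// Hypothetical helper: true when the count can be addressed with a
// 32-bit signed index.
constexpr bool fits_in_int32(std::int64_t count) {
    return count <= std::numeric_limits<std::int32_t>::max();
}
```

For 80000 × 80000 at 50% fill this gives 3,200,000,000 entries, which is above the 32-bit signed maximum of 2,147,483,647.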

Best, Yan

yanzhaobiomath avatar Mar 15 '24 15:03 yanzhaobiomath

We do not have a hint, and cannot do anything without a reproducible example.

Also keep in mind that R has limits on vector length. That does affect the Matrix package and its sparse matrix representation.

eddelbuettel avatar Mar 15 '24 15:03 eddelbuettel

Sorry, the file of triplets is too large to upload here, so here is an example that generates a triplet list and runs into the same problem:

#include <RcppEigen.h>
#include <cstdlib>
#include <vector>
using namespace Rcpp;

// [[Rcpp::depends(RcppEigen)]]

//' @export
// [[Rcpp::export]]
Eigen::SparseMatrix<double> ComputeSNNasym(int n) {
    typedef Eigen::Triplet<double> Trip;
    std::vector<Trip> trp;

    long long idx = 0;            // total non-zeros; 64-bit so the counter itself cannot overflow
    Eigen::VectorXi xcol(n);      // non-zeros per column, passed to reserve()

    double overlapping = 0.35;    // value stored in every non-zero entry

    int a = n / 3 * 2;            // band half-width: keep entries with |i - j| < a

    for (int i = 0; i < n; ++i) {         // iterate over columns
        int id = 0;
        for (int j = 0; j < n; ++j) {     // iterate over rows
            int k = i - j;

            if (std::abs(k) < a) {
                trp.push_back(Trip(j, i, overlapping));
                idx++;
                if (idx == 2147483647) {  // count has reached the 32-bit int maximum
                    Rcpp::Rcout << "overflowing..." << std::endl;
                }
                id++;
            }
        }
        xcol[i] = id;
    }
    Rcpp::Rcout << "overlapping is done..." << std::endl;

    double sp = (idx + 0.0) / n / n;
    Rcpp::Rcout << "number of non-zeros is " << idx << std::endl;
    Rcpp::Rcout << "sparsity is " << sp << std::endl;

    Eigen::SparseMatrix<double> res(n, n);
    Rcpp::Rcout << "initialization is done..." << std::endl;
    res.reserve(xcol);
    Rcpp::Rcout << "reservation is done..." << std::endl;

    // The segfault is reported at this step.
    res.setFromTriplets(trp.begin(), trp.end());
    Rcpp::Rcout << "sparse is done..." << std::endl;

    return res;
}

You should get the same error when you call this function from R with n = 80500.

I checked the number of non-zero elements in the matrix: it is more than 2^31, so it exceeds what a 32-bit integer can hold.

But the segfault happens in the setFromTriplets step, which is inside the C++ code. I'm trying to figure out the reason. Is the RcppEigen library also limited to a vector length of 2^31?
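The non-zero count for the band pattern in the function above can be checked without allocating any triplets. The sketch below mirrors the `|i - j| < a` condition with `a = n / 3 * 2` and counts in 64-bit; `band_nonzeros` is a hypothetical helper name, not something from the thread's code.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: count the (row, col) pairs with |col - row| < a in an
// n-by-n matrix, where a = n / 3 * 2 (integer division), using 64-bit math.
std::int64_t band_nonzeros(std::int64_t n) {
    const std::int64_t a = n / 3 * 2;
    std::int64_t total = 0;
    for (std::int64_t i = 0; i < n; ++i) {
        // Rows j with i - a < j < i + a, clipped to the valid range [0, n).
        const std::int64_t lo = std::max<std::int64_t>(0, i - a + 1);
        const std::int64_t hi = std::min<std::int64_t>(n - 1, i + a - 1);
        if (hi >= lo) {
            total += hi - lo + 1;
        }
    }
    return total;
}
```

For n = 80500 this yields 5,760,159,610 entries, well past 2^31 - 1. Assuming each `Eigen::Triplet<double>` takes 16 bytes (two `int` indices plus a `double`, which is typical but platform-dependent), the triplet vector alone would need on the order of 90 GB.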

yanzhaobiomath avatar Mar 17 '24 20:03 yanzhaobiomath

I am not sure whether Eigen is limited, but I can assure you that R is. We are having the exact same issue in another project. As best I can tell, the problem is due to the <i,j,x> vectors in standard COO notation. When forming a sparseMatrix object with the Matrix package, the integer indexing for the vectors i, j and x is the constraint: the 2^31-1 you are aware of. A first guess would be that Eigen has the same problem.
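One way to picture the constraint: a compressed column layout stores cumulative per-column offsets in the index type, so the last offset equals the total number of non-zeros. The sketch below uses only the standard library and checks whether those running offsets stay representable for a given index type. `column_offsets_fit` is a hypothetical name, and the tie to Eigen rests on the assumption that the matrix uses Eigen's default `StorageIndex`, which is `int`.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: returns true if the running total of per-column counts
// (i.e. the compressed-column offsets) stays representable in IndexT.
template <typename IndexT>
bool column_offsets_fit(const std::vector<std::int64_t>& per_column_counts) {
    const std::int64_t limit = std::numeric_limits<IndexT>::max();
    std::int64_t running = 0;
    for (std::int64_t count : per_column_counts) {
        running += count;
        if (running > limit) {
            return false;  // this offset would not fit in IndexT
        }
    }
    return true;
}
```

With three columns of 1,000,000,000 entries each, `column_offsets_fit<std::int32_t>` returns false while `column_offsets_fit<std::int64_t>` returns true. Whether a matrix built with a 64-bit index type could then be passed back to R through RcppEigen is a separate question that this sketch does not answer.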

So I am afraid I have no real fix to offer here.

eddelbuettel avatar Mar 17 '24 20:03 eddelbuettel

I know the spam64 package on CRAN explicitly chooses a different (64-bit) index type to allow larger vectors, but that of course is not integrated with Eigen.

eddelbuettel avatar Mar 17 '24 21:03 eddelbuettel

I see, thanks for sharing your experience. I will try to find a way to work around this issue.

yanzhaobiomath avatar Mar 17 '24 22:03 yanzhaobiomath