mathnet-numerics icon indicating copy to clipboard operation
mathnet-numerics copied to clipboard

SVD routine for sparse matrices (to handle huge matrices like 500Kx500K)

Open AlexKNM opened this issue 11 years ago • 6 comments

Hello,

I would like to suggest to implement an SVD routine for sparse matrices. Main motivation is recommender system algorithms: Such algorithms deal with initialization of input matices (which are products of SVD) having huge data array (such as 500 000 x 500 000) of sparse data. The resulting SVD matrices are 4*500Kx500K (Dense). As a result we cannot fit all these data volume in the memory and therefore we cannot use SVD from Math.Net for such sparse matrices.

Few more discussions on the topic were here: https://mathnetnumerics.codeplex.com/discussions/441313

Best regards, Alex.

AlexKNM avatar Apr 24 '13 11:04 AlexKNM

Just for reference, some existing sparse SVD routines with compatible license:

  • http://tedlab.mit.edu/~dr/SVDLIBC/ (C)
  • http://soi.stanford.edu/~rmunk/PROPACK/index.html (Fortran)

cdrnet avatar Apr 24 '13 11:04 cdrnet

Thank you for the reference! (and for the attention as well) C is C and not C#, it seems a lot of testing/debugging will be needed to rewrite the code. I've googled quite a lot and did not find any C# ready libraries that do SVD routine sparse matrices:(

AlexKNM avatar Apr 24 '13 12:04 AlexKNM

It looks like SVDLIBC is a fork of SVDPACKC but only includes one algorithm from it - the one that seems to be the "best" general version if I read this right. It should be easy to port and parallelize, and is BSD licensed.

I'll attempt a port next month, but anyone else feel free to.

cuda avatar May 01 '13 08:05 cuda

Hi, Thank you! I am really looking forward for it! My skills are not so good to contribute here, however I can test the solution in different scales and different number of cores. Best regards, Alex.

AlexKNM avatar May 01 '13 18:05 AlexKNM

@AlexKNM I just picked a new contract and with a vacation coming up, it will be awhile before get to this.

cuda avatar May 09 '13 12:05 cuda

Hello, I am wondering if you have any suggestions for how to handle sparse matrix SVDs. I know this issue has been open for a while, and I'm just looking for any more modern ways to handle it (preferably in C# or .NET compatible).

The SVD calculation in Math.NET does not seem to handle sparse matrices in any performant way, so I've been using dense matrices, but it's taking too much memory for empty space (99.99% 0's). For reference, my test-run matrix is approximately 20,000x20,000 doubles.

Control.UseNativeMKL(); Control.MaxDegreeOfParallelism = Environment.ProcessorCount; // # of logical cores MathNet.Numerics.LinearAlgebra.Factorization.Svd<double> svd = myMatrix.Svd(true); // <-- this line is slow if myMatrix is Sparse. Runs fine if it's Dense.

Thank you!

allanchao avatar Sep 17 '16 00:09 allanchao