Surprise Parallel Computation of Similarity Matrices

Hi, I was wondering if it would be feasible to make the computation of similarity matrices run in parallel. This would help speed up the process, utilizing multiple cores for computation.

Reference link for parallel programming with Cython: http://cython.readthedocs.io/en/latest/src/userguide/parallelism.html

Apr 08 '18 13:04 gautamramk

I guess it would be fairly easy to do with joblib, yes, especially since all the similarity metrics are computed with some sort of map / reduce process.

I assume, though, that you're asking because the Spearman computation takes a lot of time (#168)? I implemented a non-optimized version of Spearman's rho a while ago, and I remember it taking forever to compute. There are probably ways to optimize it (besides parallel computing), but I'm not familiar at all with the details.
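To make the map / reduce idea above a bit more concrete, here is a minimal sketch of a parallel pairwise similarity computation. It is not Surprise's actual code: the names `cosine`, `sim_chunk`, and `similarity_matrix` and the list-of-dicts rating layout are hypothetical, and it uses the standard library's `concurrent.futures` in place of joblib so that it runs without extra dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity over the items both users rated (dicts: item -> rating)."""
    common = u.keys() & v.keys()
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sim_chunk(ratings, pairs):
    """Map step: compute the similarity for each (i, j) pair in one chunk."""
    return [(i, j, cosine(ratings[i], ratings[j])) for i, j in pairs]


def similarity_matrix(ratings, n_workers=2):
    """Split all user pairs into chunks, map them to workers, reduce into a matrix."""
    n = len(ratings)
    pairs = list(combinations(range(n), 2))
    # Round-robin split so every worker gets a similar number of pairs.
    chunks = [pairs[k::n_workers] for k in range(n_workers)]
    # Start from the identity: every user is fully similar to themselves.
    sim = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Reduce step: fill both halves of the symmetric matrix.
        for result in pool.map(lambda chunk: sim_chunk(ratings, chunk), chunks):
            for i, j, s in result:
                sim[i][j] = sim[j][i] = s
    return sim


if __name__ == "__main__":
    # Hypothetical toy data: one dict of item -> rating per user.
    toy_ratings = [{"a": 1, "b": 2}, {"a": 2, "b": 4}, {"c": 5}]
    for row in similarity_matrix(toy_ratings):
        print(row)
```

Threads keep the sketch simple, but for pure-Python work like this the GIL prevents a real multi-core speed-up. A process pool, joblib, or a Cython loop that releases the GIL would be needed for that, and the chunking and reduce steps would stay the same.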

Apr 08 '18 13:04 NicolasHug

I didn't ask this specifically for the Spearman computation; I was just asking in general. It would be a very nice feature to have.

Apr 10 '18 13:04 gautamramk

Sure, I agree!

Apr 10 '18 13:04 NicolasHug

Can I take this up during my upcoming vacation?

Apr 13 '18 12:04 gautamramk

Absolutely

Apr 13 '18 12:04 NicolasHug

Hi gautamramk, I was wondering if you are still working on this issue. If not, I can take it up.

Jun 25 '18 06:06 DibyamAgrawal

I'll do it. I'm getting back to this issue now.

Jun 27 '18 06:06 gautamramk