
Parallel Computation of Similarity Matrices

Open gautamramk opened this issue 7 years ago • 7 comments

Hi, I was wondering if it would be feasible to make the computation of similarity matrices run in parallel. This would help speed up the process, utilizing multiple cores for computation.

Reference link for parallel programming with Cython: http://cython.readthedocs.io/en/latest/src/userguide/parallelism.html

gautamramk avatar Apr 08 '18 13:04 gautamramk

I guess it would be fairly easy to do with joblib, yes, especially since all the similarity metrics are computed with some sort of map / reduce process.

I assume, though, that you're asking because the Spearman computation takes a lot of time (#168)? I implemented a non-optimized version of Spearman's rank correlation a while ago, and I remember it taking forever to compute. There are probably ways to optimize it (besides parallel computing), but I'm not familiar at all with the details.
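
To make the joblib idea a bit more concrete, here is a rough sketch of the map / reduce pattern, not Surprise's actual implementation. It assumes a small dense NumPy ratings matrix and a plain cosine similarity over full rows (Surprise's own metrics only use co-rated items), and the names `cosine_block`, `parallel_cosine` and `block_size` are made up for illustration. Each worker computes one block of rows of the matrix (the "map"), and the blocks are stacked at the end (the "reduce").

```python
import numpy as np
from joblib import Parallel, delayed


def cosine_block(ratings, start, stop):
    """Map step: cosine similarity of rows start..stop-1 against every row."""
    norms = np.linalg.norm(ratings, axis=1)
    norms[norms == 0] = 1.0  # avoid dividing by zero for all-zero rows
    dots = ratings[start:stop] @ ratings.T
    return dots / np.outer(norms[start:stop], norms)


def parallel_cosine(ratings, n_jobs=2, block_size=2):
    """Split the rows into blocks, compute each block in a worker, stack them."""
    n = ratings.shape[0]
    bounds = [(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    blocks = Parallel(n_jobs=n_jobs)(
        delayed(cosine_block)(ratings, s, e) for s, e in bounds
    )
    return np.vstack(blocks)  # reduce step: reassemble the full matrix


if __name__ == "__main__":
    # Hypothetical toy data: 4 users x 3 items.
    ratings = np.array(
        [[5, 3, 0], [4, 0, 1], [0, 2, 5], [1, 1, 1]], dtype=float
    )
    print(parallel_cosine(ratings, n_jobs=2).round(3))
```

Whether this actually pays off depends on the size of the matrix: for small inputs the cost of starting workers and copying data will likely outweigh the gain, so `block_size` would need tuning on real data.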

NicolasHug avatar Apr 08 '18 13:04 NicolasHug

I wasn't asking because of the Spearman computation specifically; I was just asking in general. It would be a very nice feature to have.

gautamramk avatar Apr 10 '18 13:04 gautamramk

Sure, I agree!

NicolasHug avatar Apr 10 '18 13:04 NicolasHug

Can I take this up during my upcoming vacation?

gautamramk avatar Apr 13 '18 12:04 gautamramk

Absolutely

NicolasHug avatar Apr 13 '18 12:04 NicolasHug

Hi gautamramk, I was wondering whether you are still working on this issue. If not, I can take it up.

DibyamAgrawal avatar Jun 25 '18 06:06 DibyamAgrawal

I'll do it; I am getting back to this issue.

gautamramk avatar Jun 27 '18 06:06 gautamramk