Optional parallelization of cosine similarity computation (Issue #95)
This PR adds a parallel_cosine_similarity parameter to find_duplicates. It defaults to True, so existing code keeps its current behavior.
I have a really huge dataset and not enough RAM to use multiprocessing, so being able to disable the parallel computation of cosine similarity was the only way I could use the package.
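To illustrate the idea, here is a minimal, self-contained sketch of a flag that switches between a process pool and a plain serial loop. It is not the package's actual code: the names cosine_similarity_matrix, _cosine_row, parallel, and max_workers are hypothetical, and only the standard library is used.

```python
import math
from concurrent.futures import ProcessPoolExecutor
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def _cosine_row(task: Tuple[Vector, Sequence[Vector]]) -> List[float]:
    """Cosine similarity of one query vector against every vector."""
    query, vectors = task
    query_norm = math.sqrt(sum(x * x for x in query))
    row = []
    for other in vectors:
        other_norm = math.sqrt(sum(x * x for x in other))
        if query_norm == 0.0 or other_norm == 0.0:
            # Convention chosen for this sketch: zero vectors get similarity 0.
            row.append(0.0)
            continue
        dot = sum(a * b for a, b in zip(query, other))
        row.append(dot / (query_norm * other_norm))
    return row


def cosine_similarity_matrix(
    vectors: Sequence[Vector],
    parallel: bool = True,  # hypothetical flag, defaults to the parallel path
    max_workers: Optional[int] = None,
) -> List[List[float]]:
    """Pairwise cosine similarities, computed in parallel or serially."""
    if not parallel:
        # Serial path: slower, but no worker processes and no extra copies.
        return [_cosine_row((v, vectors)) for v in vectors]

    # Parallel path: each task is pickled and sent to a worker process,
    # so the full list of vectors is copied along with every task.
    tasks = [(v, vectors) for v in vectors]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_cosine_row, tasks))
```

With parallel=False everything runs in the calling process, which is the behavior I need on a machine with limited RAM. When using the parallel path in a script, the call should sit under an `if __name__ == "__main__":` guard so worker processes can import the module safely.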
Oh, I haven't updated the tests...
Related to #95. Also, please fork from the dev branch, not master. Have a look at the contribution guide.
@EduardKononov Do you intend to work on this?
@tanujjain Yes, I do, but later, because I have no free time at all right now. The deadline is mid-January. Hopefully sooner.
Unfortunately, I still have no time to do that. I'm just letting you know that I haven't forgotten, but I haven't had the opportunity to work on it.
@EduardKononov Thanks for the info. Do you think you'll have time in the coming weeks? Otherwise, I may have to start working on it sometime in February (2nd/3rd week).
@tanujjain No, I don't think so.
Closing since this is being tackled in #185