imagededup Optional parallelization of cosine similarity computation (Issue #95)

This PR adds a parallel_cosine_similarity parameter to find_duplicates. It defaults to True, so existing code won't break. I have a very large dataset and not enough RAM to use multiprocessing, so being able to disable the parallel computation of cosine similarity was the only way I could use the package.
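For illustration only, here is a minimal sketch of the idea: a boolean flag that switches between a chunked, pooled computation and a plain sequential one. This is not the package's code. The function name cosine_similarity_matrix and the parameters parallel, chunk_size and max_workers are hypothetical, and a thread pool stands in for the process pool to keep the example self-contained.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence

Vector = Sequence[float]


def _cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def _similarity_rows(rows: Sequence[Vector], all_vectors: Sequence[Vector]) -> List[List[float]]:
    """Similarity of each vector in `rows` against every vector in `all_vectors`."""
    return [[_cosine(r, v) for v in all_vectors] for r in rows]


def cosine_similarity_matrix(
    vectors: Sequence[Vector],
    parallel: bool = True,          # hypothetical flag, mirrors the idea in this PR
    chunk_size: int = 2,            # hypothetical: rows handled per task
    max_workers: Optional[int] = None,
) -> List[List[float]]:
    """Full pairwise cosine similarity matrix, optionally computed with a pool."""
    if not parallel:
        # Sequential path: no pool, no extra workers holding data.
        return _similarity_rows(vectors, vectors)

    # Parallel path: split the rows into chunks and map them over a pool.
    # A thread pool is used here only as a stand-in for a process pool.
    chunks = [vectors[i:i + chunk_size] for i in range(0, len(vectors), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        blocks = pool.map(lambda chunk: _similarity_rows(chunk, vectors), chunks)
        # pool.map preserves input order, so rows come back in the original order.
        return [row for block in blocks for row in block]
```

Both paths return the same matrix. The flag only changes how the work is scheduled, which is why keeping the parallel path as the default leaves existing callers unaffected.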

Oct 13 '20 16:10 EduardKononov

Oh, I haven't changed the tests...

Oct 13 '20 16:10 EduardKononov

Related to #95. Also, please fork from the dev branch, not master. Have a look at the contribution guide.

Nov 17 '20 12:11 tanujjain

@EduardKononov Do you intend to work on this?

Dec 01 '20 14:12 tanujjain

@tanujjain yes, I do, but later, because I have no free time at all right now. My deadline is mid-January. Hopefully sooner.

Dec 01 '20 14:12 EduardKononov

Unfortunately, I still have no time to do that. I'm just here to let you know that I haven't forgotten, but I haven't had the opportunity yet.

Jan 15 '21 19:01 EduardKononov

@EduardKononov Thanks for the info. Do you think you'll have time in the following weeks? Otherwise, I may have to start working on it sometime in February (2nd/3rd week).

Jan 15 '21 19:01 tanujjain

@tanujjain no, I don't think so

Jan 15 '21 19:01 EduardKononov

Closing since this is being tackled in #185

Dec 28 '22 11:12 tanujjain