
Speed up MinHash and LSH using One-Permutation Hashing

Open ekzhu opened this issue 5 years ago • 0 comments

One-permutation hashing seems to speed up MinHash creation without losing much accuracy.
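For context, here is a minimal sketch of the idea, not datasketch's API. Each token is hashed once, the 32-bit hash range is split into `num_bins` equal bins, and only the smallest in-bin offset per bin is kept, so building a signature costs one hash per token rather than one per signature slot. The names `oph_signature`, `estimate_jaccard`, and `EMPTY` are hypothetical, SHA-1 truncated to 32 bits is an assumed hash choice, and empty bins are simply skipped in the estimator instead of being densified.

```python
import hashlib
import struct

EMPTY = -1  # hypothetical marker for a bin that received no token


def _hash32(token: str) -> int:
    """Map a token to a 32-bit unsigned integer (assumed choice: SHA-1, first 4 bytes)."""
    digest = hashlib.sha1(token.encode("utf-8")).digest()
    return struct.unpack("<I", digest[:4])[0]


def oph_signature(tokens, num_bins=128):
    """Build a one-permutation signature: one hash per token, one minimum per bin."""
    bin_width = (1 << 32) // num_bins
    signature = [EMPTY] * num_bins
    for token in set(tokens):
        h = _hash32(token)
        # Clamp so leftover hash values land in the last bin when
        # num_bins does not divide 2**32 evenly.
        b = min(h // bin_width, num_bins - 1)
        offset = h - b * bin_width
        if signature[b] == EMPTY or offset < signature[b]:
            signature[b] = offset
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as matching bins over bins non-empty in at least one set."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same number of bins")
    matches = 0
    both_empty = 0
    for a, b in zip(sig_a, sig_b):
        if a == EMPTY and b == EMPTY:
            both_empty += 1
        elif a == b:
            matches += 1
    usable = len(sig_a) - both_empty
    return matches / usable if usable else 0.0


if __name__ == "__main__":
    set_a = [f"item-{i}" for i in range(1000)]
    set_b = [f"item-{i}" for i in range(500, 1500)]
    # True Jaccard similarity of these two sets is 500 / 1500.
    print(estimate_jaccard(oph_signature(set_a), oph_signature(set_b)))
```

Small sets leave many bins empty, which is where the accuracy cost shows up; a densification step that fills empty bins from neighbouring ones would be needed before such signatures could be used as LSH keys.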

Related papers: Lazo, FLASH.

We can try this out, but whether it is worthwhile depends on the accuracy-speed trade-off. I would also rank this lower in priority than #109, since memory matters more than speed for big data analytics.

ekzhu, Jan 21 '20 19:01