datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Results 72 datasketch issues

The API documentation shows how to find approximate neighbours with Jaccard similarity above a threshold. On the other hand, how can we find the most dissimilar instance? Further,...

question

Guys, I am desperately looking for a way to obtain hash values as an array of ints based on a provided key. Any direction? Thanks in advance, P

question

Is there an example of removing duplicate docs using MinHash?

help wanted
question
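A minimal sketch of the idea behind this question, using only the Python standard library rather than datasketch's own API. The names `signature`, `estimated_jaccard`, and `deduplicate`, the signature length of 64, and the 0.8 threshold are all assumptions made for illustration. Comparison is pairwise, so it only suits small collections; for larger ones, an LSH index such as datasketch's MinHashLSH would presumably replace the inner loop.

```python
import hashlib

NUM_PERM = 64  # signature length; an arbitrary choice for this sketch


def signature(tokens, num_perm=NUM_PERM):
    """Return a MinHash signature: the minimum hash value under each salted hash.

    `tokens` must be a non-empty collection of strings.
    """
    sig = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # a different salt simulates a different hash function
        sig.append(min(
            int.from_bytes(
                hashlib.blake2b(t.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for t in tokens
        ))
    return sig


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature positions."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def deduplicate(docs, threshold=0.8):
    """Keep the first document of each near-duplicate group (pairwise, O(n^2))."""
    kept = []  # list of (doc_id, signature) for documents kept so far
    for doc_id, text in docs.items():
        sig = signature(set(text.lower().split()))
        if all(estimated_jaccard(sig, s) < threshold for _, s in kept):
            kept.append((doc_id, sig))
    return [doc_id for doc_id, _ in kept]


docs = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "The quick brown fox jumps over the lazy dog",  # same words, different case
    "c": "an unrelated sentence about sketching algorithms",
}
print(deduplicate(docs))  # ['a', 'c']
```

Documents are tokenized here by lowercasing and splitting on whitespace; shingling or another tokenizer may suit real text better.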

I have a use case where I have to store the MinHash of (n) different categories of files and then query them. For example, if I have documents of category...

How do I connect to AWS Keyspaces (Cassandra), given that it asks for an SSL certificate and the service's user name and password? How do I pass these to MinHashLSH's constructor? The way to...

enhancement
help wanted

I want to delete an index from a MinHash LSH Forest, but I couldn't find a "remove" function in the forest like the one in LSH.

Say we have three documents A, B and C. Each document might contain different words. According to the documentation of [datasketch.MinHash](http://ekzhu.com/datasketch/documentation.html#minhash), we can get a MinHash for A...

question

Is the result of MinHashLSH.query() sorted in ascending or descending order of Jaccard similarity?

question

Hi, is there any plan to provide support for Postgres? I'm willing to work on that. Thanks!