datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
From the API documentation, we can learn how to find approximate neighbours with Jaccard similarity above a threshold. On the other hand, how can we find the most dissimilar instances? Further,...
Guys, I am desperately looking for a way to obtain hash values as an array of ints based on a provided key. Any direction? Thanks in advance, P
Is there an example of removing duplicate docs using MinHash?
I have a use case where I have to store the MinHash of (n) different categories of files and then query them. For example, if I have documents of category...
How do I connect to AWS Keyspaces (Cassandra), given that it asks for an SSL certificate and the service's user name and password? How do I pass these in MinHashLSH's constructor? The way to...
I want to delete an index from MinHashLSHForest, but I didn't find a "remove" function in the forest like the one in MinHashLSH.
Say we have three documents A, B and C. Each document might contain different words. According to the documentation of [datasketch.MinHash](http://ekzhu.com/datasketch/documentation.html#minhash), we can get a min-hash for A...
Is the return value of MinHashLSH.query() sorted in ascending/descending order by Jaccard similarity?
Hi, is there any plan to provide support for PostgreSQL? I'm willing to work on that. Thanks!
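
For the question above about removing duplicate docs using MinHash, a minimal sketch might look like the following. It is written from scratch with Python's standard library rather than with datasketch's own classes, so the helper names (`shingles`, `minhash_signature`, `estimate_jaccard`, `deduplicate`), the 3-word shingle size, the 64 hash functions, and the 0.8 threshold are all assumptions chosen for illustration.

```python
import hashlib
import re

NUM_PERM = 64  # number of salted hash functions (assumed value)


def shingles(text, k=3):
    """Return the set of k-word shingles of a document (lowercased)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_perm=NUM_PERM):
    """For each salted hash function, keep the minimum hash over all items."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for s in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching positions estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def deduplicate(docs, threshold=0.8):
    """Keep a document only if it is not too similar to any kept document."""
    kept = {}  # key -> signature, in insertion order
    for key, text in docs.items():
        sig = minhash_signature(shingles(text))
        if all(estimate_jaccard(sig, other) < threshold for other in kept.values()):
            kept[key] = sig
    return list(kept)


docs = {
    "a": "The quick brown fox jumps over the lazy dog",
    "b": "the quick brown fox jumps over the lazy dog!",
    "c": "MinHash estimates set similarity using random hash functions",
}
unique_keys = deduplicate(docs)
```

This sketch compares every new document against every kept one, which is quadratic in the number of documents. For large collections, an LSH index such as the library's MinHashLSH is the usual way to avoid all-pairs comparison.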