datasketch
Top-k with MinHash LSH Ensemble?
Hello, I've seen that MinHash LSH Forest is a variation of MinHash LSH for top-k queries instead of threshold ones. Is it possible to do top-k queries with MinHash LSH Ensemble?
So far, LSH Ensemble does not support top-k queries natively. However, you can always retrieve candidates by threshold first, sort the intermediate result, and then take the top k.
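A minimal sketch of that workaround, in case it helps. The reply above does not say what to sort by, so ranking by exact containment of the query in each candidate is an assumption here, and it requires keeping the original sets around. The helper names (`containment`, `top_k_by_containment`, `demo`) and the example data are hypothetical; only `MinHash`, `MinHashLSHEnsemble`, `index`, and `query` come from datasketch.

```python
import heapq


def containment(query_set, candidate_set):
    """Exact containment of the query in the candidate: |Q & X| / |Q|."""
    if not query_set:
        return 0.0
    return len(query_set & candidate_set) / len(query_set)


def top_k_by_containment(candidate_keys, query_set, sets_by_key, k):
    """Score each thresholded candidate and keep the k highest (score, key) pairs."""
    scored = (
        (containment(query_set, sets_by_key[key]), key) for key in candidate_keys
    )
    return heapq.nlargest(k, scored)


def demo():
    # Imported lazily so the helpers above work without datasketch installed.
    from datasketch import MinHash, MinHashLSHEnsemble

    def sketch(items, num_perm=128):
        m = MinHash(num_perm=num_perm)
        for item in items:
            m.update(item.encode("utf8"))
        return m

    # Hypothetical example data.
    sets_by_key = {
        "s1": {"cat", "dog", "fish", "bird"},
        "s2": {"cat", "dog", "horse"},
        "s3": {"apple", "pear"},
    }
    query_set = {"cat", "dog", "fish"}

    # Step 1: threshold query. A low threshold keeps enough candidates to rank.
    lsh = MinHashLSHEnsemble(threshold=0.5, num_perm=128, num_part=4)
    lsh.index([(key, sketch(s), len(s)) for key, s in sets_by_key.items()])
    candidates = lsh.query(sketch(query_set), len(query_set))

    # Step 2: sort the intermediate result and take the top k.
    return top_k_by_containment(candidates, query_set, sets_by_key, k=2)
```

Note that the threshold caps what you can get back: if fewer than k sets pass it, you get fewer than k results, so you may need to lower the threshold and query again.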
@LSparkzwz you might want to consider the frequent items sketch for approximate top-k queries on streams.
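For context, here is a rough illustration of the idea behind a frequent items sketch, assuming a Misra-Gries style summary is what is meant. This is a plain-Python sketch, not the API of any particular library, and the function name `misra_gries` is hypothetical. It answers "which items occur most often in a stream", which is a different question from top-k similar sets.

```python
def misra_gries(stream, k):
    """Keep at most k - 1 counters; any item occurring more than n / k times
    in a stream of length n is guaranteed to survive. Counts are lower bounds."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Approximate top items, ordered by their (under)estimated counts.
summary = misra_gries(["a", "a", "a", "a", "b", "b", "b", "c", "d"], k=3)
ranked = sorted(summary.items(), key=lambda kv: kv[1], reverse=True)
```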