elasticsearch-minhash
Differences from the ES 5+ built-in MinHash Token Filter
It seems that since 5.0, Elasticsearch has included a built-in MinHash Token Filter.
What are the differences between it and this plugin, if any? What reason would we have to use one over the other? Will the built-in MinHash filter work with the dynarank diversity script in the same way as the minhash from the plugin?
Thanks!
This is a good question. I would like to know this as well.
I have the same question. To my understanding, this plugin's minhash produces a single short signature, while the official Elasticsearch filter emits a long array of signatures (length 512). I would love to get confirmation from the author, though.
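To make the "short signature" versus "long array" distinction concrete, here is a minimal generic MinHash sketch in Python. It is not the code of this plugin or of the built-in filter. The function names, the choice of 128 hash functions, the 3-word shingles, and the 1-bit-per-hash packing are all assumptions made for illustration only.

```python
import hashlib


def shingles(text, k=3):
    """Split text into a set of k-word shingles (illustrative tokenizer)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_values(tokens, num_hashes=128):
    """Return one minimum hash value per seeded hash function (the 'long array')."""
    values = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt acts as a different hash function
        values.append(min(
            int.from_bytes(
                hashlib.blake2b(t.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for t in tokens
        ))
    return values


def pack_bits(values):
    """Keep only the lowest bit of each value and pack them (the 'short signature')."""
    bits = 0
    for v in values:
        bits = (bits << 1) | (v & 1)
    return bits.to_bytes((len(values) + 7) // 8, "big")


def estimate_jaccard(a, b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


doc_a = minhash_values(shingles("the quick brown fox jumps over the lazy dog"))
doc_b = minhash_values(shingles("the quick brown fox jumps over the lazy cat"))

print(len(doc_a))                      # number of values in the long form
print(len(pack_bits(doc_a)))           # number of bytes in the packed form
print(estimate_jaccard(doc_a, doc_b))  # rough similarity estimate
```

Under these assumptions, the long form keeps every minimum value as a separate entry, while the packed form trades precision for a compact value that fits in a single field. Which of the two shapes each implementation actually produces is exactly what this thread asks the author to confirm.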