elasticsearch-minhash

Differences with ES 5+ Built-in MinHash Token Filter

Open thunderstumpges opened this issue 6 years ago • 2 comments

It seems that since 5.0, Elasticsearch has included a built-in MinHash Token Filter.

What are the differences between it and this plugin (if any)? What reason would we have to use one over the other? Will the built-in MinHash filter work with the dynarank diversity script the same way as the minhash from the plugin?

Thanks!

thunderstumpges, May 02 '18 18:05

This is a good question. I would like to know this as well.

nishitd, Jan 09 '19 05:01

I have the same question. To my understanding, the MinHash in this plugin produces a short signature, while the official Elasticsearch filter gives a long array of signatures of length 512. I would love to get confirmation from the author, though.

ndenStanford, Jan 27 '22 06:01
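To make the size difference described in the last comment concrete, here is a minimal Python sketch of the two output shapes: a long list of min-hash values (512 here, matching the length mentioned above) versus a short packed signature that keeps only a few bits per hash. This is an illustration under assumptions, not the plugin's or Elasticsearch's actual code: the function names, the blake2b-based hash family, the 3-word shingles, and the choice of 128 hashes at 1 bit each are all hypothetical.

```python
import hashlib


def shingles(text, n=3):
    """Return the set of word n-grams in text (hypothetical tokenization)."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def _hash(seed, token):
    """64-bit hash of token salted with seed; stands in for a hash family."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=512):
    """Classic MinHash: one minimum hash value per hash function."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def compact_signature(signature, bits=1):
    """Keep the lowest `bits` bits of each min-hash and pack them into bytes."""
    mask = (1 << bits) - 1
    packed = 0
    for value in signature:
        packed = (packed << bits) | (value & mask)
    total_bits = bits * len(signature)
    return packed.to_bytes((total_bits + 7) // 8, "big")


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


doc = shingles("the quick brown fox jumps over the lazy dog")
long_sig = minhash_signature(doc, num_hashes=512)    # list of 512 integers
short_sig = compact_signature(minhash_signature(doc, num_hashes=128), bits=1)
print(len(long_sig), len(short_sig))
```

In this sketch the long form is a list of 512 integers per document, while the compact form is a single 16-byte value. Whether either one can be consumed by the dynarank script depends on the field format that script expects, which the thread does not settle.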