elastik-nearest-neighbors
Use standard benchmarking for results
See https://github.com/erikbern/ann-benchmarks
This is interesting, and I've thought about it a little. I'm not sure using the exact same metrics is actually a good idea, though I might be wrong. Here's my reasoning: the plugin is designed to handle many parallel requests across (theoretically) arbitrarily many Elasticsearch nodes. In contrast, as far as I can tell, Erik's benchmarks measure raw serial speed (queries per second, IIRC). Comparing recall is fine in principle, but my LSH implementation is very simplistic in the scheme of things, and its recall could be measured on an identical implementation without any of the Elasticsearch orchestration.
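To make the distinction concrete, here is a minimal Python sketch of the two ways of measuring throughput, plus a recall calculation. It is not code from the plugin or from ann-benchmarks: `fake_query` is a hypothetical stand-in that just sleeps to imitate a network round trip, and the names `serial_qps`, `parallel_qps`, and `recall_at_k` are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact nearest neighbors found in the approximate result."""
    exact = set(exact_ids)
    if not exact:
        return 0.0
    return len(exact & set(approx_ids)) / len(exact)


def serial_qps(query_fn, queries):
    """Queries per second when requests are issued one at a time."""
    start = time.perf_counter()
    for q in queries:
        query_fn(q)
    return len(queries) / (time.perf_counter() - start)


def parallel_qps(query_fn, queries, workers):
    """Queries per second with up to `workers` requests in flight at once."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so every query finishes before the clock stops.
        list(pool.map(query_fn, queries))
    return len(queries) / (time.perf_counter() - start)


def fake_query(q):
    # Hypothetical stand-in for a request to a search cluster (assumption:
    # latency dominated by a fixed 10 ms wait rather than client-side CPU).
    time.sleep(0.01)
    return [q, q + 1, q + 2]


if __name__ == "__main__":
    queries = list(range(40))
    print("serial QPS:  ", round(serial_qps(fake_query, queries)))
    print("parallel QPS:", round(parallel_qps(fake_query, queries, workers=8)))
    print("recall:", recall_at_k([1, 2, 9], [1, 2, 3]))
```

With a latency-bound backend like this stand-in, the parallel figure comes out well above the serial one even though each individual query is no faster, which is why the two numbers aren't directly comparable. Recall needs only the returned IDs and the ground truth, so it can be computed the same way in either setup.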