ann-benchmarks
Support 'logit' scale for x-axis
The matplotlib logit scale is very nice when plotting values between 0 and 1. In particular, this fixes the current problem of plots being incomprehensible in the "high recall" regime.
I suggest changing the --x-log and --y-log parameters of plot.py to --x-scale and --y-scale, taking values that are passed directly to matplotlib. This will also support the symlog scale.
I have a commit with this change here: https://github.com/thomasahle/ann-benchmarks/commit/3bf07f41715a8cfe64b4c7c7bfdaa46027b2892b
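A minimal sketch of what such options could look like, assuming plot.py uses argparse. This is not taken from the linked commit; build_parser and the list of choices are hypothetical and only illustrate passing the value straight through to matplotlib's set_xscale / set_yscale.

```python
import argparse

# Scale names that matplotlib's set_xscale / set_yscale accept by name.
SCALES = ["linear", "log", "symlog", "logit"]


def build_parser():
    # Hypothetical parser; the real plot.py has more options than this.
    parser = argparse.ArgumentParser()
    parser.add_argument("--x-scale", choices=SCALES, default="linear",
                        help="scale name handed to ax.set_xscale")
    parser.add_argument("--y-scale", choices=SCALES, default="linear",
                        help="scale name handed to ax.set_yscale")
    return parser


args = build_parser().parse_args(["--x-scale", "logit", "--y-scale", "log"])
print(args.x_scale, args.y_scale)
```

The parsed strings would then be used as ax.set_xscale(args.x_scale) and ax.set_yscale(args.y_scale).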
But will that work when the accuracy is exactly 1 or 0? Won't those be projected to +-infinity?
Do you have a screenshot you can show?
I haven't run all the algos, but I do think the logit plot makes things easier to read:
In particular, I think it helps for a plot like http://ann-benchmarks.com/glove-25-angular_10_angular.html, which is currently incomprehensible.
You are right that the brute force algorithm with its recall=1 is a problem... Perhaps one could make a custom xscale that adds 0 and 1 to logit by surgery...
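To make the endpoint problem concrete, here is a small sketch of the logit function itself (plain Python, not matplotlib's implementation). The explicit handling of 0 and 1 is an assumption added for illustration; it just returns the mathematical limits.

```python
import math


def logit(p):
    # Log-odds of p. Only finite for 0 < p < 1; the two endpoints
    # are returned as their limits, -inf and +inf.
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return math.log(p / (1.0 - p))


for recall in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(recall, logit(recall))
```

A run with recall exactly 1, such as brute force, therefore has no finite position on a pure logit axis.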
I think something that warps [0, 1] using roughly x^10 might also be OK. That would project 0.9 to 0.35 and 0.99 to 0.9.
Not sure how hard it is to implement a custom scale. I looked at it a while ago and it seemed like a fair amount of boilerplate.
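The x^10 numbers above are easy to check. A quick sketch, where warp is just a hypothetical helper name:

```python
def warp(x, power=10):
    # Power warp of [0, 1]; both endpoints stay fixed.
    return x ** power


for recall in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(f"{recall} -> {warp(recall):.3f}")
```

Unlike logit, this keeps 0 and 1 at finite positions, so a recall of exactly 1 can still be drawn.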
It turns out to be pretty easy in matplotlib 3.3 - and upgrading doesn't break anything in ann-benchmarks.
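alpha = 3
# 'function' scale takes a (forward, inverse) pair; ax is an existing Axes
ax.set_xscale('function', functions=(
    lambda x: 1 - (1 - x) ** (1 / alpha),  # forward: recall -> axis position
    lambda x: 1 - (1 - x) ** alpha))       # inverse: axis position -> recall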
This produces the following output:
Varying alpha can "stretch" the high-recall regime more or less.
It might be nice to fix the ticks as well, so there isn't a big blank space.
This is the standard linear scale for comparison.
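To see what alpha does without plotting, here is a sketch that evaluates the same forward and inverse functions in plain Python (the names forward and inverse are only for illustration). It prints where a few recall values land on the axis for alpha = 1, 2, 3.

```python
def forward(x, alpha=3):
    # Recall -> axis position in [0, 1].
    return 1 - (1 - x) ** (1 / alpha)


def inverse(y, alpha=3):
    # Axis position -> recall; undoes forward().
    return 1 - (1 - y) ** alpha


for alpha in (1, 2, 3):
    positions = [round(forward(r, alpha), 3) for r in (0.0, 0.5, 0.9, 0.99, 1.0)]
    print(alpha, positions)
```

With alpha=1 this is the linear scale. Larger alpha moves 0.9 and 0.99 to the left, so the interval between 0.99 and 1 takes up more of the axis, while 0 and 1 stay at the ends.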
Nice! I think we might want to throw in minor gridlines, but looks pretty good!!
This is alpha=2.
With minor grid lines:
With evenly spaced major gridlines:
(not sure the minor gridlines make sense here, since you can't guess what values they represent.)
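One way to get evenly spaced major gridlines on such a scale is to pick evenly spaced axis positions and map them back to recall values with the inverse function. This is a sketch of that idea for alpha=2, not necessarily how the plots above were produced; evenly_spaced_ticks is a hypothetical helper.

```python
def inverse(y, alpha=2):
    # Axis position -> recall for the power scale.
    return 1 - (1 - y) ** alpha


def evenly_spaced_ticks(n, alpha=2):
    # n + 1 recall values whose axis positions are 0, 1/n, ..., 1.
    return [inverse(i / n, alpha) for i in range(n + 1)]


print(evenly_spaced_ticks(4))  # [0.0, 0.4375, 0.75, 0.9375, 1.0]
```

The resulting list could then be passed to ax.set_xticks.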
Are your result hdf5 files available for download? Then I could check what looks better for the full plot (without having to run days of benchmarks myself :))
Hi Thomas
You can access the groundtruth for glove-100-angular via:
aws s3 cp --recursive s3://ann-benchmarks.com/results/2020-07-13/glove-100-angular/ .
I would also like to see it with a few more algorithms. I honestly don't know how important it is to put that much focus on the high recall regime. I think the linear scale is still easier to interpret, but I'm clearly biased after having looked at these plots for years :-)
Hi Martin, I get "fatal error: Unable to locate credentials".
I see, I hope this will work: http://ann-benchmarks.com/results/glove-100-angular.zip
It somehow doesn't find those files. It seems to be because they don't have the suffix hdf5. Maybe my ann-benchmarks version is too new?
E.g. in the glove-100-angular.zip file I find glove-100-angular/10/puffinn/angular_1073741824_fht_crosspolytope_0_1, but when I run ann-benchmarks myself, it generates files like glove-100-angular-mine/10/scann/1000_0_2_1_1_10.hdf5.
Oh, right. That was a recent change. Just rename all of them to contain the suffix.
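A minimal sketch of the rename, assuming the zip has been extracted into a local directory. add_hdf5_suffix is a hypothetical helper, not part of ann-benchmarks.

```python
from pathlib import Path


def add_hdf5_suffix(root):
    # Append ".hdf5" to every regular file under root that lacks it.
    # sorted() materialises the listing before any file is renamed.
    renamed = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix != ".hdf5":
            target = path.with_name(path.name + ".hdf5")
            path.rename(target)
            renamed.append(target)
    return renamed
```

For example, add_hdf5_suffix("glove-100-angular") would turn 10/puffinn/angular_1073741824_fht_crosspolytope_0_1 into the same name with .hdf5 appended. Running it a second time changes nothing.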
Alright, here you go. Certainly a bit more uniform in terms of where "the action" is in the plot. If you upload the data, I can test glove-25 too, on which the effect should be even stronger.
I can do that tomorrow (or Erik might be able to tell you how to get s3 to work ;-))
Alright, here is alpha 3, 2, 1 (same as linear) and logit for glove-25.
The plots make it clear that there is a lot of difference between the algorithms in the high recall regime.
I probably prefer alpha=3 here, but alpha=2 may be the best compromise for all the datasets.
One could also consider having it as a user-controlled option in the interactive plots.
For now I'm just suggesting adding the option to the plot.py script.
I also tried doing alpha=4 with axis labels looking more like the logit plot:
Nice, I think this looks really great. Happy to merge!
Any chance of getting this actually on annbenchmarks.com? :-)
Sure, please update create_website.py to use these plots as well.
Would be great to have good defaults in the website create script.
I guess it's time to rerun the benchmarks soon!