ann-benchmarks
Some datasets are missing from https://ann-benchmarks.com/${dataset_name}.hdf5
DATASETS: Dict[str, Callable[[str], None]] = {
    "deep-image-96-angular": deep_image,
    "fashion-mnist-784-euclidean": fashion_mnist,
    "gist-960-euclidean": gist,
    "glove-25-angular": lambda out_fn: glove(out_fn, 25),
    "glove-50-angular": lambda out_fn: glove(out_fn, 50),
    "glove-100-angular": lambda out_fn: glove(out_fn, 100),
    "glove-200-angular": lambda out_fn: glove(out_fn, 200),
    "mnist-784-euclidean": mnist,
    "random-xs-20-euclidean": lambda out_fn: random_float(out_fn, 20, 10000, 100, "euclidean"),
    "random-s-100-euclidean": lambda out_fn: random_float(out_fn, 100, 100000, 1000, "euclidean"),
    "random-xs-20-angular": lambda out_fn: random_float(out_fn, 20, 10000, 100, "angular"),
    "random-s-100-angular": lambda out_fn: random_float(out_fn, 100, 100000, 1000, "angular"),
    "random-xs-16-hamming": lambda out_fn: random_bitstring(out_fn, 16, 10000, 100),
    "random-s-128-hamming": lambda out_fn: random_bitstring(out_fn, 128, 50000, 1000),
    "random-l-256-hamming": lambda out_fn: random_bitstring(out_fn, 256, 100000, 1000),
    "random-s-jaccard": lambda out_fn: random_jaccard(out_fn, n=10000, size=20, universe=40),
    "random-l-jaccard": lambda out_fn: random_jaccard(out_fn, n=100000, size=70, universe=100),
    "sift-128-euclidean": sift,
    "nytimes-256-angular": lambda out_fn: nytimes(out_fn, 256),
    "nytimes-16-angular": lambda out_fn: nytimes(out_fn, 16),
    "word2bits-800-hamming": lambda out_fn: word2bits(out_fn, "400K", "w2b_bitlevel1_size800_vocab400K"),
    "lastfm-64-dot": lambda out_fn: lastfm(out_fn, 64),
    "sift-256-hamming": lambda out_fn: sift_hamming(out_fn, "sift.hamming.256"),
    "kosarak-jaccard": lambda out_fn: kosarak(out_fn),
    "movielens1m-jaccard": movielens1m,
    "movielens10m-jaccard": movielens10m,
    "movielens20m-jaccard": movielens20m,
}
Are all the datasets in this list available for download? I found that some of them are missing.
Checking dataset availability from: ann-benchmarks.com
Total datasets to check: 27
-----------------------------------
Checking: deep-image-96-angular
✅ Available: deep-image-96-angular
Checking: fashion-mnist-784-euclidean
✅ Available: fashion-mnist-784-euclidean
Checking: gist-960-euclidean
✅ Available: gist-960-euclidean
Checking: glove-25-angular
✅ Available: glove-25-angular
Checking: glove-50-angular
✅ Available: glove-50-angular
Checking: glove-100-angular
✅ Available: glove-100-angular
Checking: glove-200-angular
✅ Available: glove-200-angular
Checking: mnist-784-euclidean
✅ Available: mnist-784-euclidean
Checking: random-xs-20-euclidean
❌ Not available: random-xs-20-euclidean
Checking: random-s-100-euclidean
❌ Not available: random-s-100-euclidean
Checking: random-xs-20-angular
✅ Available: random-xs-20-angular
Checking: random-s-100-angular
❌ Not available: random-s-100-angular
Checking: random-xs-16-hamming
❌ Not available: random-xs-16-hamming
Checking: random-s-128-hamming
❌ Not available: random-s-128-hamming
Checking: random-l-256-hamming
❌ Not available: random-l-256-hamming
Checking: random-s-jaccard
❌ Not available: random-s-jaccard
Checking: random-l-jaccard
❌ Not available: random-l-jaccard
Checking: sift-128-euclidean
✅ Available: sift-128-euclidean
Checking: nytimes-256-angular
✅ Available: nytimes-256-angular
Checking: nytimes-16-angular
✅ Available: nytimes-16-angular
Checking: word2bits-800-hamming
✅ Available: word2bits-800-hamming
Checking: lastfm-64-dot
✅ Available: lastfm-64-dot
Checking: sift-256-hamming
✅ Available: sift-256-hamming
Checking: kosarak-jaccard
✅ Available: kosarak-jaccard
Checking: movielens1m-jaccard
❌ Not available: movielens1m-jaccard
Checking: movielens10m-jaccard
✅ Available: movielens10m-jaccard
Checking: movielens20m-jaccard
❌ Not available: movielens20m-jaccard
-----------------------------------
Check completed!
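
For anyone who wants to reproduce the check, here is a minimal sketch in Python. It is not the script that produced the log above (that script is not shown here). It assumes the URL pattern from the title and that the server answers HEAD requests, which may not hold. The names `dataset_url`, `head_ok`, and `check_datasets` are hypothetical helpers, not part of ann-benchmarks.

```python
import urllib.request
from typing import Callable, Dict, Iterable

# Assumption: base URL taken from the pattern quoted in the issue title.
BASE_URL = "https://ann-benchmarks.com"


def dataset_url(name: str) -> str:
    """Build the download URL for a dataset name, e.g. 'glove-25-angular'."""
    return f"{BASE_URL}/{name}.hdf5"


def head_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` answers with HTTP 200.

    Assumption: the server supports HEAD; otherwise a ranged GET would be needed.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError (e.g. 404) and socket timeouts all derive from OSError.
        return False


def check_datasets(
    names: Iterable[str],
    probe: Callable[[str], bool] = head_ok,
) -> Dict[str, bool]:
    """Map each dataset name to whether `probe` reports its URL as reachable."""
    return {name: probe(dataset_url(name)) for name in names}


if __name__ == "__main__":
    # A small subset of the names from the DATASETS dict quoted above.
    sample = ["glove-25-angular", "random-s-jaccard", "movielens1m-jaccard"]
    for name, available in check_datasets(sample).items():
        print(("Available: " if available else "Not available: ") + name)
```

Passing `probe` as a parameter keeps the loop testable without network access; a stub such as `lambda url: "glove" in url` can stand in for the real request.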