Some datasets are missing from https://ann-benchmarks.com/${dataset_name}.hdf5

Open zhuwenxing opened this issue 10 months ago • 0 comments

DATASETS: Dict[str, Callable[[str], None]] = {
    "deep-image-96-angular": deep_image,
    "fashion-mnist-784-euclidean": fashion_mnist,
    "gist-960-euclidean": gist,
    "glove-25-angular": lambda out_fn: glove(out_fn, 25),
    "glove-50-angular": lambda out_fn: glove(out_fn, 50),
    "glove-100-angular": lambda out_fn: glove(out_fn, 100),
    "glove-200-angular": lambda out_fn: glove(out_fn, 200),
    "mnist-784-euclidean": mnist,
    "random-xs-20-euclidean": lambda out_fn: random_float(out_fn, 20, 10000, 100, "euclidean"),
    "random-s-100-euclidean": lambda out_fn: random_float(out_fn, 100, 100000, 1000, "euclidean"),
    "random-xs-20-angular": lambda out_fn: random_float(out_fn, 20, 10000, 100, "angular"),
    "random-s-100-angular": lambda out_fn: random_float(out_fn, 100, 100000, 1000, "angular"),
    "random-xs-16-hamming": lambda out_fn: random_bitstring(out_fn, 16, 10000, 100),
    "random-s-128-hamming": lambda out_fn: random_bitstring(out_fn, 128, 50000, 1000),
    "random-l-256-hamming": lambda out_fn: random_bitstring(out_fn, 256, 100000, 1000),
    "random-s-jaccard": lambda out_fn: random_jaccard(out_fn, n=10000, size=20, universe=40),
    "random-l-jaccard": lambda out_fn: random_jaccard(out_fn, n=100000, size=70, universe=100),
    "sift-128-euclidean": sift,
    "nytimes-256-angular": lambda out_fn: nytimes(out_fn, 256),
    "nytimes-16-angular": lambda out_fn: nytimes(out_fn, 16),
    "word2bits-800-hamming": lambda out_fn: word2bits(out_fn, "400K", "w2b_bitlevel1_size800_vocab400K"),
    "lastfm-64-dot": lambda out_fn: lastfm(out_fn, 64),
    "sift-256-hamming": lambda out_fn: sift_hamming(out_fn, "sift.hamming.256"),
    "kosarak-jaccard": lambda out_fn: kosarak(out_fn),
    "movielens1m-jaccard": movielens1m,
    "movielens10m-jaccard": movielens10m,
    "movielens20m-jaccard": movielens20m,
}

Are all the datasets in this list available for download? I checked each one against ann-benchmarks.com and found that 10 of the 27 are not available:

Checking dataset availability from: ann-benchmarks.com
Total datasets to check: 27
-----------------------------------
Checking: deep-image-96-angular
✅ Available: deep-image-96-angular
Checking: fashion-mnist-784-euclidean
✅ Available: fashion-mnist-784-euclidean
Checking: gist-960-euclidean
✅ Available: gist-960-euclidean
Checking: glove-25-angular
✅ Available: glove-25-angular
Checking: glove-50-angular
✅ Available: glove-50-angular
Checking: glove-100-angular
✅ Available: glove-100-angular
Checking: glove-200-angular
✅ Available: glove-200-angular
Checking: mnist-784-euclidean
✅ Available: mnist-784-euclidean
Checking: random-xs-20-euclidean
❌ Not available: random-xs-20-euclidean
Checking: random-s-100-euclidean
❌ Not available: random-s-100-euclidean
Checking: random-xs-20-angular
✅ Available: random-xs-20-angular
Checking: random-s-100-angular
❌ Not available: random-s-100-angular
Checking: random-xs-16-hamming
❌ Not available: random-xs-16-hamming
Checking: random-s-128-hamming
❌ Not available: random-s-128-hamming
Checking: random-l-256-hamming
❌ Not available: random-l-256-hamming
Checking: random-s-jaccard
❌ Not available: random-s-jaccard
Checking: random-l-jaccard
❌ Not available: random-l-jaccard
Checking: sift-128-euclidean
✅ Available: sift-128-euclidean
Checking: nytimes-256-angular
✅ Available: nytimes-256-angular
Checking: nytimes-16-angular
✅ Available: nytimes-16-angular
Checking: word2bits-800-hamming
✅ Available: word2bits-800-hamming
Checking: lastfm-64-dot
✅ Available: lastfm-64-dot
Checking: sift-256-hamming
✅ Available: sift-256-hamming
Checking: kosarak-jaccard
✅ Available: kosarak-jaccard
Checking: movielens1m-jaccard
❌ Not available: movielens1m-jaccard
Checking: movielens10m-jaccard
✅ Available: movielens10m-jaccard
Checking: movielens20m-jaccard
❌ Not available: movielens20m-jaccard
-----------------------------------
Check completed!
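
For anyone who wants to reproduce the check, here is a minimal sketch of how it could be done. It is not the script that produced the log above. The URL pattern comes from the issue title; the helper names (`dataset_url`, `head_ok`, `check_datasets`) are made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET.

```python
import urllib.request
from typing import Callable, Dict, Iterable

# URL pattern taken from the issue title: https://ann-benchmarks.com/${dataset_name}.hdf5
BASE_URL = "https://ann-benchmarks.com"


def dataset_url(name: str) -> str:
    """Build the download URL for a dataset name (hypothetical helper)."""
    return f"{BASE_URL}/{name}.hdf5"


def head_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` answers with HTTP 200.

    Assumption: the server treats HEAD like GET for existing files.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError (e.g. 404) and timeouts are all OSError subclasses.
        return False


def check_datasets(
    names: Iterable[str],
    probe: Callable[[str], bool] = head_ok,
) -> Dict[str, bool]:
    """Map each dataset name to whether `probe` reports its URL as reachable."""
    return {name: probe(dataset_url(name)) for name in names}


if __name__ == "__main__":
    # A few names from the DATASETS dict; pass list(DATASETS) to check them all.
    names = ["glove-25-angular", "random-s-jaccard", "movielens1m-jaccard"]
    for name, available in check_datasets(names).items():
        mark = "✅ Available" if available else "❌ Not available"
        print(f"{mark}: {name}")
```

The `probe` argument is only there so the logic can be exercised without network access, for example with `check_datasets(names, probe=lambda url: False)`.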

zhuwenxing, Jan 03 '25 05:01