raft
Add raft-ann-bench script to generate synthetic dataset
This PR adds a Python script that generates a synthetic dataset.
python -m raft-ann-bench.generate_dataset --rows 1000000 --cols 128 --dtype float32 dataset/base.fbin
# After the dataset is generated, you can create query and ground truth files
python -m raft-ann-bench.generate_groundtruth dataset/base.fbin --output=dataset --queries=random --n_queries=10000
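For readers unfamiliar with the `.fbin` extension used above, here is a minimal sketch of what generating such a file could look like. It is not the script added by this PR. It assumes the common big-ann-benchmarks layout (two `uint32` header values for row and column count, followed by row-major `float32` data) and assumes normally distributed values; the helper names `generate_dataset`, `write_fbin`, and `read_fbin` are hypothetical.

```python
import numpy as np


def generate_dataset(n_rows, n_cols, seed=0):
    """Return an (n_rows, n_cols) float32 array of standard normal samples.

    The distribution is an assumption for illustration only.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_rows, n_cols), dtype=np.float32)


def write_fbin(path, data):
    """Write a 2-D array as .fbin: uint32 rows, uint32 cols, then float32 data."""
    data = np.ascontiguousarray(data, dtype=np.float32)
    n_rows, n_cols = data.shape
    with open(path, "wb") as f:
        np.array([n_rows, n_cols], dtype=np.uint32).tofile(f)
        data.tofile(f)


def read_fbin(path):
    """Read a file written by write_fbin back into a 2-D float32 array."""
    with open(path, "rb") as f:
        n_rows, n_cols = np.fromfile(f, dtype=np.uint32, count=2)
        data = np.fromfile(f, dtype=np.float32, count=int(n_rows) * int(n_cols))
    return data.reshape(int(n_rows), int(n_cols))
```

A small round trip such as `write_fbin("base.fbin", generate_dataset(1000, 128))` followed by `read_fbin("base.fbin")` should return an array of the same shape and values.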
@tfeher this is still a really valuable feature to have. I'm going to push this to 24.08, given the looming code freeze. Also cc @dantegd, since you are refactoring the Python APIs.