
Add raft-ann-bench script to generate synthetic dataset

Open tfeher opened this issue 1 year ago • 2 comments

This PR adds a Python script to generate a synthetic dataset.

tfeher avatar Nov 23 '23 15:11 tfeher

python -m raft-ann-bench.generate_dataset --rows 1000000 --cols 128 --dtype float32 dataset/base.fbin
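For readers unfamiliar with the `.fbin` layout, here is a minimal sketch of what a file like `dataset/base.fbin` might contain. It assumes the little-endian layout commonly used by ANN benchmark tooling (a `uint32` row count, a `uint32` column count, then row-major `float32` values). The helper names `generate_uniform`, `write_fbin` and `read_fbin` are hypothetical and are not taken from the script in this PR; the uniform distribution is also an assumption.

```python
import random
import struct


def generate_uniform(n_rows, n_cols, seed=0):
    """Return n_rows vectors of n_cols floats drawn uniformly from [0, 1).

    Hypothetical helper: the distribution used by the real script is not
    stated in this thread.
    """
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]


def write_fbin(path, rows):
    """Write rows as: uint32 n_rows, uint32 n_cols, then float32 data (little-endian)."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    with open(path, "wb") as f:
        f.write(struct.pack("<II", n_rows, n_cols))
        for row in rows:
            if len(row) != n_cols:
                raise ValueError("all rows must have the same length")
            f.write(struct.pack(f"<{n_cols}f", *row))


def read_fbin(path):
    """Read a file written by write_fbin back into a list of rows."""
    with open(path, "rb") as f:
        n_rows, n_cols = struct.unpack("<II", f.read(8))
        rows = []
        for _ in range(n_rows):
            rows.append(list(struct.unpack(f"<{n_cols}f", f.read(4 * n_cols))))
    return rows


if __name__ == "__main__":
    # Small sizes only: pure Python is far too slow for 1000000 x 128.
    write_fbin("base_small.fbin", generate_uniform(1000, 16))
```

At the sizes in the command above, a vectorized library would be the practical choice; the standard-library version is used here only to make the byte layout explicit.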

# After the dataset is generated, you can create query and ground truth files

python -m raft-ann-bench.generate_groundtruth dataset/base.fbin --output=dataset --queries=random --n_queries=10000
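As a rough illustration of what "ground truth" means here, the sketch below computes exact nearest neighbors by brute force: for each query it scores every base vector and keeps the `k` closest. The function names and the choice of squared Euclidean distance are assumptions for illustration; the thread does not say which metric or implementation `generate_groundtruth` uses.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(base, queries, k):
    """Return, for each query, the indices of its k nearest base vectors.

    Exhaustive search: O(len(queries) * len(base) * dim). Ties are broken
    by the lower base index because (distance, index) tuples are compared.
    """
    neighbors = []
    for q in queries:
        scored = ((squared_l2(q, v), i) for i, v in enumerate(base))
        best = heapq.nsmallest(k, scored)
        neighbors.append([i for _, i in best])
    return neighbors


if __name__ == "__main__":
    base = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
    queries = [[0.9, 0.1], [4.0, 4.0]]
    print(brute_force_knn(base, queries, k=2))
```

An approximate index can then be evaluated by comparing its returned indices against these exact ones, for example as recall at `k`.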

tfeher avatar Nov 23 '23 15:11 tfeher

@tfeher this is still a really valuable feature to have. I'm going to push it to 24.08, given the looming code freeze. Also cc @dantegd, since you are refactoring the Python APIs.

cjnolet avatar May 21 '24 15:05 cjnolet