jvector
Bench improvements
This PR introduces a few ease-of-use tools in jvector-examples.
- Bench now loads the list of available datasets from a YAML file. The list is provided in jvector-examples/yaml-examples/datasets.yml.
- It creates BenchYAML, which reads config files with JVector hyperparameters in YAML format.
- It creates HelloVectorWorld, a single, clean, and simple example.
Here's an example YAML file showing what can be specified and how:
configVersion: 4 # do not change this number unless you know what you are doing
dataset: ada002-100k
construction:
  outDegree: [32, 48, 64, 96, 128]
  efConstruction: [60, 80, 100, 120, 160, 200, 400, 600, 800]
  neighborOverflow: [1.2f, 2.0f]
  addHierarchy: [No, Yes]
  compression:
    - type: None
    - type: PQ
      parameters:
        # m: 192 # we can either specify the integer m or the integer mFactor. In this case, m will be set to the data dimensionality divided by mFactor
        # mFactor: 8
        # k: 256 # optional parameter. By default, k=256
        centerData: No
        anisotropicThreshold: -1.0 # optional parameter. By default, anisotropicThreshold=-1 (i.e., no anisotropy)
    - type: PQ
      parameters:
        mFactor: 2
        centerData: No
  reranking:
    - FP
    - NVQ
  useSavedIndexIfExists: Yes
search:
  topKOverquery:
    # the value of topK followed by a list with the overquery rates we want to cover
    10: [1.0, 2.0, 5.0, 10.0]
    100: [1.0, 2.0]
  useSearchPruning: [No, Yes]
  compression:
    - type: None
    - type: PQ
      parameters:
        m: 192
        k: 256 # optional parameter. By default, k=256
        centerData: No
        anisotropicThreshold: -1.0 # optional parameter. By default, anisotropicThreshold=-1 (i.e., no anisotropy)
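
To make the arithmetic behind this config concrete, here is a minimal Java sketch. It is not code from this PR: the class BenchGridSketch and its methods are hypothetical names used only for illustration. It assumes that m is the integer quotient of the data dimensionality and mFactor (as the comment in the YAML states), that the construction lists are combined as a full Cartesian product, that an overquery rate multiplies topK to give the number of candidates kept for reranking, and that ada002 vectors have 1536 dimensions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of jvector-examples. */
final class BenchGridSketch {

    /** Resolves m from mFactor, assuming m = dimension / mFactor (integer division). */
    static int resolveM(int dimension, int mFactor) {
        if (mFactor <= 0) {
            throw new IllegalArgumentException("mFactor must be positive");
        }
        return dimension / mFactor;
    }

    /** Counts configurations, assuming every value of each list is paired with every value of the others. */
    static long countCombinations(List<?>... axes) {
        long total = 1;
        for (List<?> axis : axes) {
            total *= axis.size();
        }
        return total;
    }

    /** Expands topKOverquery into {topK, rerankK} pairs, assuming rerankK = topK * overquery. */
    static List<int[]> expandOverquery(Map<Integer, List<Double>> topKOverquery) {
        List<int[]> pairs = new ArrayList<>();
        for (Map.Entry<Integer, List<Double>> entry : topKOverquery.entrySet()) {
            int topK = entry.getKey();
            for (double rate : entry.getValue()) {
                pairs.add(new int[] {topK, (int) Math.round(topK * rate)});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Assumption: ada002 embeddings have 1536 dimensions, so mFactor 8 gives m = 192.
        System.out.println("m = " + resolveM(1536, 8));

        // The four graph-construction lists from the example (compression and reranking not included).
        long graphConfigs = countCombinations(
                List.of(32, 48, 64, 96, 128),
                List.of(60, 80, 100, 120, 160, 200, 400, 600, 800),
                List.of(1.2f, 2.0f),
                List.of(false, true));
        System.out.println("graph configurations = " + graphConfigs);

        // The topKOverquery section from the example, in file order.
        Map<Integer, List<Double>> overquery = new LinkedHashMap<>();
        overquery.put(10, List.of(1.0, 2.0, 5.0, 10.0));
        overquery.put(100, List.of(1.0, 2.0));
        for (int[] pair : expandOverquery(overquery)) {
            System.out.println("topK = " + pair[0] + ", rerankK = " + pair[1]);
        }
    }
}
```

Under these assumptions, the four graph-construction lists alone give 5 × 9 × 2 × 2 = 180 combinations before compression, reranking, and search settings are considered, so wide lists can make a run long. Trimming each list to the values of interest keeps the run manageable.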