vector-db-benchmark

Framework for benchmarking vector search engines

Results 33 vector-db-benchmark issues

We have https://github.com/qdrant/vector-db-benchmark/blob/master/scripts/process-benchmarks.ipynb, but it only prepares the data. Web-based interactive graphs would be nice; Plotly or the Dash framework could be used for this. Please use [benchmarks.js](https://github.com/qdrant/landing_page/blob/master/qdrant-landing%2Fthemes%2Fqdrant%2Fstatic%2Fjs%2Fbenchmarks.js) as a reference....

enhancement
good first issue

This issue covers two tasks: - Many users try `*-default` as their starting point, but the default config differs somewhat across engines, so we should make it the same...

good first issue

Some engines, like Elasticsearch and OpenSearch, take relatively long to boot. It would be nice to have a wait feature built into the benchmarking script. Note: It's a low priority...

enhancement
nice-to-have
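
A wait feature like the one requested above could be sketched as a simple polling loop. This is only an illustration under stated assumptions: `wait_for_port` and its parameters are hypothetical names, not part of the benchmark scripts, and an open TCP port does not guarantee the engine is ready to serve queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll a TCP port until it accepts a connection or the timeout expires.

    Hypothetical helper: returns True as soon as a connection succeeds,
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only proves the port is open,
            # not that the engine has finished its own startup.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait and try again.
            time.sleep(interval)
    return False
```

A benchmark run could call such a helper once before configuring the engine, for example `wait_for_port("localhost", 9200, timeout=120)`, and abort if it returns `False`. An engine-specific health check would be a stricter alternative.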

It would be nice if we could support pulling embeddings from any Hugging Face dataset. This would make the project even more useful for external users :) The spec for this could...

enhancement

This PR adds small changes that we're already using: - Adds progress detail for downloaded files - Allows filtering by a specific client count on the search...

Between Elasticsearch v8.10 and v8.12, the dense vector dimension limit moved from 2048 to 4096. The benchmark should adjust it accordingly in https://github.com/qdrant/vector-db-benchmark/blob/5b9bffbe7fecff24b8885650049b9e1fdc798f00/engine/clients/elasticsearch/configure.py#L53 Further docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#index-vectors-knn-search old v8.10 reference: https://www.elastic.co/guide/en/elasticsearch/reference/8.10/dense-vector.html#dense-vector-params new v8.12...
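
One way to make the limit depend on the server version is sketched below. It is not the project's actual `configure.py` logic. The function and constant names are hypothetical, and the cutoff of 8.12 is an assumption taken from the two versions named in the issue; versions in between are conservatively given the lower limit.

```python
from typing import Tuple

LEGACY_MAX_DIMS = 2048  # limit stated for v8.10
RAISED_MAX_DIMS = 4096  # limit stated for v8.12
# Assumption: treat 8.12 as the first version with the raised limit.
RAISED_LIMIT_SINCE: Tuple[int, int] = (8, 12)


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '8.12.0' or 'v8.10'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def max_dense_vector_dims(version: str) -> int:
    """Return the assumed dense vector dimension limit for a version."""
    if parse_version(version) >= RAISED_LIMIT_SINCE:
        return RAISED_MAX_DIMS
    return LEGACY_MAX_DIMS


def check_vector_size(vector_size: int, version: str) -> None:
    """Raise ValueError if the dataset's vectors exceed the assumed limit."""
    limit = max_dense_vector_dims(version)
    if vector_size > limit:
        raise ValueError(
            f"vector size {vector_size} exceeds limit {limit} for version {version}"
        )
```

With this sketch, `check_vector_size(3072, "8.10.4")` raises, while `check_vector_size(3072, "8.12.0")` passes.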

Here's a sample traceback for 504 Gateway Timeout server errors on the Elasticsearch client when the config/vector size leads to longer merge operations. #103 adds a way of fixing/avoiding this issue....

While running `vector-db-benchmarks` I've noticed that - we are updating the collection with `max_optimization_threads: 0` [before uploading points][max-optimization-threads-0] - and then once again with `max_optimization_threads: 1` [after upload is finished][max-optimization-threads-1]...

The following type of error recurs on non-local setups: ``` pymilvus.exceptions.MilvusException: ``` Full traceback: ``` Traceback (most recent call last): File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker result...

We recently introduced `delete_client` in the base client classes when adding pgvector in #91. We need to check whether there are other places where it can help, e.g. replace closable...