
knn search

Open fendoukobe opened this issue 3 years ago • 12 comments

""" GET my-knn-index-1/_search { "size": 2, "query": { "knn": { "my_vector2": { "vector": [2, 3, 5, 6], "k": 2 } } } } k is the number of neighbors the search of each graph will return. You must also include the size option. This option indicates how many results the query actually returns. The plugin returns k amount of results for each shard (and each segment) and size amount of results for the entire query. The plugin supports a maximum k value of 10,000."""

Hi, I want to know what the k in this passage means. Also, what value should I set for k in development: 10, 100, or something else?

fendoukobe avatar Jun 29 '21 06:06 fendoukobe

It depends on your use case. By setting k and size to 10, for example, you get the closest 10 results for your query (i.e., the top 10 neighbors of the vector in your query).
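For instance, a request for the top 10 neighbors might look like this (a sketch reusing the index and field names from the snippet above):

```
GET my-knn-index-1/_search
{
  "size": 10,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 10
      }
    }
  }
}
```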

neo-anderson avatar Jul 07 '21 20:07 neo-anderson

Thanks, I get it.

The k-NN index has millions of docs and k-NN search is fast, but when the count reaches tens of millions the search becomes very slow, taking more than 20 seconds. How do I optimize it? I want to warm up the index. How do I calculate the memory required?

fendoukobe avatar Jul 08 '21 02:07 fendoukobe

And how do I cancel the index warmup? Is its validity period related to the setting `"knn.cache.item.expiry.minutes": "10m"`? Thanks
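For reference, the Open Distro docs describe a warmup API that loads an index's graphs into the cache ahead of time; there is no explicit cancel operation, since cached entries are simply evicted after the configured expiry. A sketch, using the index name from the first example:

```
GET /_opendistro/_knn/warmup/my-knn-index-1?pretty
```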

fendoukobe avatar Jul 08 '21 03:07 fendoukobe

Hi, I see that k-NN has a branch called Faiss Support. Does this branch query faster and use less memory? How can I use it in my Elasticsearch cluster?

fendoukobe avatar Jul 09 '21 00:07 fendoukobe

> The k-NN index has millions of docs and k-NN search is fast, but when the count reaches tens of millions the search becomes very slow, taking more than 20 seconds.

Check whether your query is a brute-force script query or Approximate k-NN: https://opendistro.github.io/for-elasticsearch-docs/docs/knn/

neo-anderson avatar Jul 09 '21 04:07 neo-anderson

Approximate k-NN Search

fendoukobe avatar Jul 09 '21 05:07 fendoukobe

Hi @fendoukobe

Here is how we calculate memory: https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/#estimating-memory-usage.
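As an illustration (assuming, say, 10 million vectors of dimension 1024 and the default HNSW parameter M = 16; these numbers are assumptions, not measurements from your cluster), the estimate from that page works out to roughly:

```
1.1 * (4 * dimension + 8 * M) bytes per vector
  = 1.1 * (4 * 1024 + 8 * 16)
  = 1.1 * 4224
  ≈ 4646 bytes per vector

10,000,000 vectors * 4646 bytes ≈ 46 GB of native (off-heap) memory
```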

After the slow query, could you paste the k-NN stats? `GET /_opendistro/_knn/stats?pretty`

With regards to faiss support, we are actively working on it here. I am working on an RFC and will post it soon. We want to support faiss's product quantization in order to reduce memory consumption. The branch on this repo is a development branch and should not be used in production. It only includes faiss's HNSW implementation, which should not have significant performance differences compared to nmslib.

jmazanec15 avatar Jul 09 '21 23:07 jmazanec15


Sorry, I cannot provide the data now because the production environment is elsewhere. But I can confirm that the memory footprint ratio is no more than 20%. Thank you

fendoukobe avatar Jul 12 '21 01:07 fendoukobe

@fendoukobe I see. What is the dimension on your vectors? Also, how many nodes are you running on and what type of machines are you using?

jmazanec15 avatar Jul 19 '21 21:07 jmazanec15

The dimension is 1024. There are three nodes; each node has 512 GB of memory and two physical CPUs, with 16 cores per physical CPU. Each server is split into two virtual nodes that share memory and CPU. Hot nodes use solid-state drives; warm nodes use mechanical drives.

My configuration is as follows

```
PUT /_cluster/settings
{
  "persistent": {
    "knn.cache.item.expiry.enabled": true,
    "knn.cache.item.expiry.minutes": "10m",
    "knn.memory.circuit_breaker.limit": "60%",
    "knn.circuit_breaker.unset.percentage": 90,
    "knn.algo_param.index_thread_qty": 32
  }
}
```

fendoukobe avatar Jul 20 '21 09:07 fendoukobe

One potential way to speed up is to not return the vector field in your query and only return the document id (if your use case lets you). This can be done by adding the query parameter `?_source_exclude=my_vector2`.

Can you provide the query you are using in the case of high latency?
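Sketched with the index and field names from the first example in this thread, the request-body equivalent of that exclusion would be:

```
GET my-knn-index-1/_search
{
  "_source": {
    "excludes": ["my_vector2"]
  },
  "size": 10,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2, 3, 5, 6],
        "k": 10
      }
    }
  }
}
```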

jmazanec15 avatar Jul 28 '21 18:07 jmazanec15

> One potential way to speed up is to not return the vector field in your query and only return the document id (if your use case lets you). This can be done by adding the query parameter `?_source_exclude=my_vector2`.
>
> Can you provide the query you are using in the case of high latency?

good idea

zxbing avatar Apr 15 '22 09:04 zxbing