k-NN
Is there some performance data?
We have recently been wanting to use this plugin, but we don't know how it performs. Is there any performance data available?
Hi @JamesIsHuang,
At the moment we do not have the numbers documented. We are planning to publish a dedicated blog post on this. Meanwhile, this post may give some insight: https://medium.com/@kumon/how-to-realize-similarity-search-with-elasticsearch-3dd5641b9adb.
Has the plugin been tested with tens of millions or 100 million vectors yet? Performance at the million scale seems quite good.
Hi @JamesIsHuang,
We have done performance analysis for different vector dimensions and collection sizes. We need to formalize it and present it in a consumable manner. We are prioritizing the effort to bring this to the docs.
Here are some metrics for the scale you are looking at.
Data set: 150M vectors with 128 dimensions across different indices
Algo params: m=16, efSearch=1024, efConstruction=1024
Data nodes: 6 × m5.12xlarge
Master nodes: 3 × m5.xlarge
Latencies: tp50: 22ms, tp90: 40ms, tp99: 90ms
Hi @vamshin, this performance is really good. Did you return 1024 vectors when you searched?
Hi @JamesIsHuang, we made k dynamic; it ranges between 50 and 1500.
Please note, we also ran a warm-up to get the graphs loaded into memory, and our experiments do not account for warm-up time. Without warm-up, the initial queries will take a latency hit.
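For illustration, a k-NN query with a fixed k would look roughly like the sketch below; the index name (my-index), field name (my_vector), and the query vector are hypothetical placeholders, and the vector must match the mapped dimension.

# Hypothetical 3-dimensional example; k is the number of neighbors fetched per graph,
# while size controls how many results the query returns overall
curl -XPOST "https://localhost:9200/my-index/_search?pretty" -u admin:admin --insecure \
  -H 'Content-Type: application/json' -d'
{
  "size": 50,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [0.1, 0.2, 0.3],
        "k": 50
      }
    }
  }
}'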
Hi @vamshin, I found that segment merging is a very slow process. How long did it take you to merge 150 million vectors?
Hi @JamesIsHuang, sorry, I don't have exact numbers, but we were able to do that effectively by:
- Avoiding creating many small segments. Please refer to the doc on indexing performance tuning (see the sketch after this list).
- Having more shards, so that the graphs are split across these shards and forcemerge can work on smaller graphs.
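A minimal sketch of the indexing-side tuning, assuming a hypothetical index named my-index: disable refresh and replicas while bulk ingesting so that fewer small segments get created, then force merge and restore the settings.

# Disable refresh and replicas during bulk ingestion (hypothetical index name)
curl -XPUT "https://localhost:9200/my-index/_settings" -u admin:admin --insecure \
  -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "-1", "number_of_replicas": 0}}'

# ... bulk index the vectors ...

# Merge down to a single segment (costly; each node has only one merge thread)
curl -XPOST "https://localhost:9200/my-index/_forcemerge?max_num_segments=1" -u admin:admin --insecure

# Restore refresh and replicas afterwards
curl -XPUT "https://localhost:9200/my-index/_settings" -u admin:admin --insecure \
  -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "10s", "number_of_replicas": 1}}'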
Hey @vamshin,
Are there any updates regarding the documentation of performance data? I'm currently doing some benchmarks for a project that utilizes the kNN Plugin for ES and keep running into degrading latencies over time. In order to facilitate debugging, it would be interesting to know what resources are required to calculate the neighbors within <=50ms.
I'm currently running 3 m5.xlarge.elasticsearch (without sharding), using 5 indices with a rather small count of documents:
docs.count
13867
11315
53216
1459242
1302
and the following index settings:
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1,
"refresh_interval": "10s",
"index.knn": true,
"index.knn.algo_param.ef_search": 512,
"index.knn.algo_param.ef_construction": 512,
"index.knn.algo_param.m": 32
}
and am storing vectors with a dimension of 100.
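For reference, a vector field of that dimension would be mapped roughly as follows (the field name my_vector is just a placeholder):

"mappings": {
  "properties": {
    "my_vector": {
      "type": "knn_vector",
      "dimension": 100
    }
  }
}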
I see that you used 6 m5.12xlarge.elasticsearch nodes, which would suggest that the kNN Plugin needs substantial CPU as well as memory to perform appropriately?
Any help with debugging or improving latencies would be greatly appreciated :).
Hi @juliusbachnick, how are your performance tests going now? I use 4 m5.12xlarge.elasticsearch nodes with 130M vectors, but every query costs about 200ms, and QPS is just 50.
Hi @vamshin, I have a question. The JVM heap size is 32GB, so what KNN can use is 32GB * 50% * 60% = 9.6GB, and the memory required for graphs is estimated as 1.1 * (4 * dimension + 8 * M) bytes/vector. With dimension 128 and m = 16, that gives 1.1 * (4 * 128 + 8 * 16) * 1,000,000 ≈ 0.8 GB per million vectors. So I think that for the best performance one node can only support about 10M vectors, since 10 * 0.8 GB = 8 GB, which is less than 9.6 GB, right?
Hi @qingfengshiran,
KNN graphs are loaded outside the ES heap, so this is not part of the 32GB heap size.
For example: consider a machine with 100 GB of memory where the JVM uses 32 GB; the k-NN plugin uses 50% of the remaining 68 GB (i.e. 100GB - 32GB), which is 34 GB. If memory usage exceeds this value, KNN removes the least recently used graphs, which can impact search performance.
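I believe the fraction of memory the plugin may use before evicting graphs is governed by the knn.memory.circuit_breaker.limit cluster setting; a sketch of adjusting it (the 60% value is only illustrative):

# Raise the native-memory limit for k-NN graphs (dynamic cluster setting)
curl -XPUT "https://localhost:9200/_cluster/settings" -u admin:admin --insecure \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"knn.memory.circuit_breaker.limit": "60%"}}'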
Hi @vamshin, I am using 4 nodes, each with 32 cores and 192GB of memory.
Data set: 130M vectors with 128 dimensions across different indices.
Algo params: m=16, efSearch=1024, efConstruction=1024, number_of_shards: 8
But for my top-1 queries (k = 4), every search costs about 200ms and QPS is just around 50. What can I do to diagnose my performance?
Segment 0 is as follows (screenshot not rendered); the others look the same.

@vamshin, my requirement is about 2000 QPS of search over more than 400M vectors of dimension 128. Can you suggest a cluster hardware configuration and ES k-NN parameters?
Hi @qingfengshiran,
I can see a couple of suggestions:
- Increase the number of shards to 16. This way we get more parallelism.
- See if you can bring down efSearch to 512 (note that your recall might come down; you may want to double-check).
- Forcemerge to fewer segments; limit the number of segments to 5 per shard. Within a shard, segments are searched sequentially, so more segments result in more latency. Also please note, forcemerge is a costly operation and each node has only one merge thread, so it can take a really long time at your scale. One way to speed it up is to scale horizontally by adding more instances and increasing the number of shards, so that the work is more distributed (see the sketch after the link below).
You could find more details here https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/#search-performance-tuning
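As a concrete sketch (the index name my-index is a placeholder): efSearch is a dynamic index setting, so it can usually be lowered without reindexing, and forcemerge can cap the segments per shard.

# Lower efSearch dynamically (value illustrative; recall may drop)
curl -XPUT "https://localhost:9200/my-index/_settings" -u admin:admin --insecure \
  -H 'Content-Type: application/json' \
  -d '{"index": {"knn.algo_param.ef_search": 512}}'

# Merge each shard down to at most 5 segments (slow at this scale)
curl -XPOST "https://localhost:9200/my-index/_forcemerge?max_num_segments=5" -u admin:admin --insecure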
Hi @vamshin,
Thanks for the kind reply. I'd also like to know a few other things:
1. What QPS can be achieved on more than 100 million vectors?
2. I have found the shard count suggestion Number of Shards = Index Size / 30GB; is the index size the store size?

3. Finally, I'd like to know how to deploy a Docker Compose cluster across machines, one node per machine. You provide a sample cluster deployment, but it runs the whole cluster on one machine, and my attempts to span machines have failed. The Docker Compose sample with 3 nodes on one machine is here: https://opendistro.github.io/for-elasticsearch-docs/docs/install/docker/
------ data node ------
version: '3'
services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - network.host=0.0.0.0
      - node.name=odfe-node1
      - node.master=true
      - node.ingest=true
      - node.data=true
      - cluster.initial_master_nodes=odfe-master1
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233
      - discovery.seed_hosts=10.0.246.36,10.0.245.233
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233,10.0.245.10
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms32768m -Xmx32768m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /data/es/data/data:/usr/share/elasticsearch/data
      - /data/es/data/log:/usr/share/elasticsearch/log
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
networks:
  odfe-net:
docker-compose up then fails with an error (screenshot not rendered).
------ master node ------
version: '3'
services:
  odfe-master1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-master1
    environment:
      - cluster.name=odfe-cluster
      - network.host=0.0.0.0
      - node.name=odfe-master1
      - node.master=true
      - node.data=false
      - node.ingest=true
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233,10.0.245.10
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233
      - discovery.seed_hosts=10.0.246.36,10.0.245.233
      - cluster.initial_master_nodes=odfe-master1
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /data/es/master/data:/usr/share/elasticsearch/data
      - /data/es/master/log:/usr/share/elasticsearch/log
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
networks:
  odfe-net:
docker-compose up then fails with an error (screenshot not rendered).
4. How do I warm up the index? I have seen the website, but I did not find where to trigger the warm-up: https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/
Hi @qingfengshiran
1. What QPS can be achieved on more than 100 million vectors?
We have seen query latencies ranging from hundreds of milliseconds to 2000ms for around 100 million vectors, depending on memory, CPU type, number of nodes, dimensions, etc. To get good performance, all of the vectors need to fit in memory. You can check this from the k-NN stats after running warmup or some queries.
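For example, the k-NN stats API reports graph memory usage and cache hits/misses per node (adjust the endpoint and auth to your setup):

curl -XGET "https://localhost:9200/_opendistro/_knn/stats?pretty" -u admin:admin --insecure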
2. I have found the shard count suggestion Number of Shards = Index Size / 30GB; is the index size the store size?
Yes.
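You can read the store size off the cat indices API, e.g. (index name is a placeholder):

curl -XGET "https://localhost:9200/_cat/indices/my-index?v&h=index,pri,store.size" -u admin:admin --insecure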
3. Finally, I'd like to know how to deploy a Docker Compose cluster across machines, one node per machine; the provided sample runs the whole cluster on one machine, and my attempts have failed.
We will get back to you on this.
4. How do I warm up the index? I have seen the website, but I did not find where to trigger the warm-up.
Warmup documentation can be found here.
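The warm-up is triggered through an API call rather than an index setting; roughly (index name is a placeholder):

# Loads the graphs for the given index into native memory ahead of queries
curl -XGET "https://localhost:9200/_opendistro/_knn/warmup/my-index?pretty" -u admin:admin --insecure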
Hi @qingfengshiran
For Docker containers, I believe it doesn't matter whether your services (in this case the data nodes) are in the same docker-compose file or different ones.
I ran the following experiment and it worked successfully for me.
- Created a docker compose file (data.yaml) with two nodes and kibana as below
version: '3'
services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node1
      - discovery.seed_hosts=odfe-node1,odfe-node2,odfe-node3
      - cluster.initial_master_nodes=odfe-node1,odfe-node2,odfe-node3
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - odfe-data1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
  odfe-node2:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node2
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node2
      - discovery.seed_hosts=odfe-node1,odfe-node2
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - odfe-data2:/usr/share/elasticsearch/data
    networks:
      - odfe-net
  kibana:
    image: amazon/opendistro-for-elasticsearch-kibana:1.13.2
    container_name: odfe-kibana
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      ELASTICSEARCH_URL: https://odfe-node1:9200
      ELASTICSEARCH_HOSTS: https://odfe-node1:9200
    networks:
      - odfe-net
volumes:
  odfe-data1:
  odfe-data2:
networks:
  odfe-net:
- Executed docker-compose up. Note: you will see an exception that odfe-node3 is not reachable; that is fine, since we haven't started that node yet.
docker-compose -f data.yaml up
docker ps
3c22fb3bf9b7:~ balasvij$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
be5925a3eacc amazon/opendistro-for-elasticsearch:1.13.2 "/usr/local/bin/dock…" 2 minutes ago Up 2 minutes 0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp odfe-node1
ac3a3e8aff1d amazon/opendistro-for-elasticsearch:1.13.2 "/usr/local/bin/dock…" 17 minutes ago Up 2 minutes 9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp odfe-node2
7f96f019bda0 amazon/opendistro-for-elasticsearch-kibana:1.13.2 "/usr/local/bin/kiba…" 17 minutes ago Up 2 minutes 0.0.0.0:5601->5601/tcp odfe-kibana
curl https://localhost:9200/_cat/nodes -u admin:admin --insecure
172.17.0.2 38 81 14 1.41 0.67 0.38 dimr - odfe-node2
172.17.0.4 42 81 9 1.41 0.67 0.38 dimr * odfe-node1
- Created a docker-compose file node3.yaml as below
version: '3'
services:
  odfe-node3:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node3
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node3
      - discovery.seed_hosts=odfe-node1,odfe-node2,odfe-node3
      - cluster.initial_master_nodes=odfe-node1,odfe-node2,odfe-node3
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - odfe-data3:/usr/share/elasticsearch/data
    networks:
      - odfe-net
volumes:
  odfe-data3:
networks:
  odfe-net:
- Executed node3.yaml
docker-compose -f node3.yaml up
3c22fb3bf9b7:~ balasvij$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6ad96f65fcbe amazon/opendistro-for-elasticsearch:1.13.2 "/usr/local/bin/dock…" 35 seconds ago Up 32 seconds 9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp odfe-node3
be5925a3eacc amazon/opendistro-for-elasticsearch:1.13.2 "/usr/local/bin/dock…" 2 minutes ago Up 2 minutes 0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp odfe-node1
ac3a3e8aff1d amazon/opendistro-for-elasticsearch:1.13.2 "/usr/local/bin/dock…" 17 minutes ago Up 2 minutes 9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp odfe-node2
7f96f019bda0 amazon/opendistro-for-elasticsearch-kibana:1.13.2 "/usr/local/bin/kiba…" 17 minutes ago Up 2 minutes 0.0.0.0:5601->5601/tcp odfe-kibana
curl https://localhost:9200/_cat/nodes -u admin:admin --insecure
172.17.0.2 20 97 10 1.02 0.68 0.40 dimr - odfe-node2
172.17.0.5 44 97 26 1.02 0.68 0.40 dimr - odfe-node3
172.17.0.4 47 97 10 1.02 0.68 0.40 dimr * odfe-node1
Let me know if you still have issues. I ran the above experiment on macOS with Docker.