
Is there some performance data?

Open JamesIsHuang opened this issue 5 years ago • 17 comments

Is there some performance data? We want to use this plugin, but we don't know how it performs.

JamesIsHuang avatar Jul 22 '20 08:07 JamesIsHuang

Hi @JamesIsHuang,

At this moment we do not have the numbers documented. We are planning to come up with a dedicated blog post on this. Meanwhile, this blog post could give some insights: https://medium.com/@kumon/how-to-realize-similarity-search-with-elasticsearch-3dd5641b9adb

vamshin avatar Jul 22 '20 18:07 vamshin

Have you tested at the scale of tens of millions or 100 million vectors yet? Performance at the million scale seems quite good.

JamesIsHuang avatar Jul 23 '20 04:07 JamesIsHuang

Hi @JamesIsHuang ,

We have done performance analysis for different vector dimensions and collection sizes. We need to formalize it and present it in a consumable manner. We are prioritizing the effort to bring this into the docs.

Here are some metrics for the scale you are looking at.

Data set: 150M vectors with 128 dimensions across different indices
Algo params: m=16, efSearch=1024, efConstruction=1024
Data nodes: 6 × m5.12xlarge
Master nodes: 3 × m5.xlarge

Latencies: tp50: 22ms, tp90: 40ms, tp99: 90ms

vamshin avatar Jul 23 '20 04:07 vamshin

Hi @vamshin, this performance is really good. Did you return 1024 vectors per search?

JamesIsHuang avatar Jul 23 '20 08:07 JamesIsHuang

Hi @JamesIsHuang, we made k dynamic; it ranged between 50 and 1500.

Please note, we also did a warmup to load the graphs into memory, and our experiments do not account for warm-up time. Without warm-up, the initial queries will take a latency hit.
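
For reference, a minimal sketch of what such a query looks like, assuming an index with a knn_vector field named my_vector (the index name, field name, dimension, and values here are illustrative, not the actual benchmark setup):

curl -XGET "https://localhost:9200/my-knn-index/_search?pretty" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "size": 50,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [2.0, 3.0, 5.0, 7.0],
        "k": 50
      }
    }
  }
}'

Here k is the number of neighbors the graph search returns per segment, and size controls how many documents come back in the final response.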

vamshin avatar Jul 23 '20 18:07 vamshin

Hi @vamshin, I found that segment merging is a very slow process. How long did it take you to merge 150 million vectors?

JamesIsHuang avatar Jul 29 '20 06:07 JamesIsHuang

Hi @JamesIsHuang, sorry, I don't have exact numbers, but we were able to merge effectively by (see the sketch after this list):

  1. Avoiding the creation of many small segments. Please refer to the docs on indexing performance tuning.

  2. Using more shards, so that the graphs are split across shards and forcemerge operates on smaller graphs.
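
A minimal sketch of the kind of calls involved, assuming an index named my-knn-index (names and values are illustrative, not tuned recommendations):

# Pause refreshes during bulk indexing so fewer small segments are created
curl -XPUT "https://localhost:9200/my-knn-index/_settings" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "index": { "refresh_interval": "-1" }
}'

# ... bulk index the vectors ...

# Restore refreshes, then merge down to a small number of segments
curl -XPUT "https://localhost:9200/my-knn-index/_settings" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "index": { "refresh_interval": "10s" }
}'
curl -XPOST "https://localhost:9200/my-knn-index/_forcemerge?max_num_segments=1" -u admin:admin --insecure

Force merge is expensive at this scale, so it is usually run once after the bulk load completes rather than while indexing.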

vamshin avatar Aug 04 '20 19:08 vamshin

Hey @vamshin,

Are there any updates regarding the documentation of performance data? I'm currently doing some benchmarks for a project that utilizes the kNN Plugin for ES and keep running into degrading latencies over time. In order to facilitate debugging, it would be interesting to know what resources are required to calculate the neighbors within <=50ms.

I'm currently running 3 m5.xlarge.elasticsearch (without sharding), using 5 indices with a rather small count of documents:

docs.count
13867
11315
53216
1459242
1302

and the following index settings:

    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "refresh_interval": "10s",
      "index.knn": true,
      "index.knn.algo_param.ef_search": 512,
      "index.knn.algo_param.ef_construction": 512,
      "index.knn.algo_param.m": 32
    }

and am storing vectors with a dimension of 100.

I see that you used 6 m5.12xlarge.elasticsearch nodes, which would suggest that the kNN Plugin needs both CPU and memory to perform appropriately?

Any help with debugging or improving latencies would be greatly appreciated :).

juliusbachnick avatar Apr 07 '21 16:04 juliusbachnick

Hi juliusbachnick, how is your performance testing going now? I use 4 m5.12xlarge.elasticsearch nodes with 130M vectors, but every query costs about 200ms and QPS is only about 50.

qingfengshiran avatar Apr 28 '21 03:04 qingfengshiran

Hi @vamshin, I have a question. The JVM heap size is 32 GB, so the memory KNN can use is 32GB * 50% * 60% = 9.6GB. The memory required for graphs is estimated as 1.1 * (4 * dimension + 8 * M) bytes/vector, so with dim = 128 and m = 16, that is 1.1 * (4 * 128 + 8 * 16) * 1,000,000 ≈ 0.8 GB per million vectors. So for the best performance, one device can only support about 10M vectors, since 10 * 0.8 GB = 8 GB, which is less than 9.6 GB, right?

qingfengshiran avatar Apr 28 '21 03:04 qingfengshiran

Hi @qingfengshiran,

KNN graphs are loaded outside the ES heap. So this is not part of 32GB heap size.

For example: consider a machine that has 100 GB of memory and the JVM uses 32 GB. The k-NN plugin uses 50% of the remaining 68 GB (i.e., 100 GB - 32 GB), which is 34 GB. If memory usage exceeds this value, KNN removes the least recently used graphs, which can impact search performance.
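
For reference, that off-heap budget is governed by the k-NN circuit breaker limit; a minimal sketch of adjusting it, assuming the Open Distro cluster setting knn.memory.circuit_breaker.limit (the 60% value is only an example, not a recommendation):

curl -XPUT "https://localhost:9200/_cluster/settings" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "knn.memory.circuit_breaker.limit": "60%"
  }
}'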

vamshin avatar Apr 28 '21 03:04 vamshin

Hi @vamshin, I use 4 nodes, each with 32 cores and 192 GB memory. Data set: 130M vectors with 128 dimensions across different indices. Algo params: m=16, efSearch=1024, efConstruction=1024, number_of_shards: 8. But with top-1 (K = 4), every search costs about 200ms and QPS is only around 50. What can I do to check my performance? Segment 0 is shown in the attached screenshot; the others look the same.

qingfengshiran avatar Apr 28 '21 06:04 qingfengshiran

@vamshin, my requirement is about 2000 QPS of search over more than 400M vectors with dimension 128. Can you give me a suggested cluster configuration and ES k-NN parameters?

qingfengshiran avatar Apr 28 '21 06:04 qingfengshiran

Hi @qingfengshiran,

I have a few suggestions (a sketch of the corresponding calls follows below):

  1. Increase the number of shards to 16. This way we can get more parallelism.
  2. See if you can bring efSearch down to 512 (note that recall might come down, so you may want to double-check).
  3. Force merge to fewer segments; limit the number of segments to 5 per shard. Within a shard, segments are searched sequentially, so more segments means more latency. Also note that force merge is a costly operation and each node has only one merge thread, so it can take a really long time at your scale. One way to speed it up is to scale horizontally by adding more instances and increasing the number of shards, so the work is more distributed.

You could find more details here https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/#search-performance-tuning
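
A minimal sketch of the corresponding calls, assuming an index named my-knn-index and that index.knn.algo_param.ef_search can be updated dynamically in the version you run (worth double-checking):

# 2. Lower ef_search on an existing index
curl -XPUT "https://localhost:9200/my-knn-index/_settings" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "index.knn.algo_param.ef_search": 512
}'

# 3. Check how many segments each shard currently has
curl "https://localhost:9200/_cat/segments/my-knn-index?v&h=index,shard,segment,docs.count,size" -u admin:admin --insecure

# 3. Merge down so each shard ends up with few segments
curl -XPOST "https://localhost:9200/my-knn-index/_forcemerge?max_num_segments=5" -u admin:admin --insecure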

vamshin avatar Apr 29 '21 03:04 vamshin

Hi @vamshin, thanks for the kind reply. I also want to know some other information:

1. How is the QPS on more than 100 million vectors?

2. I have found the shard count suggestion: Number of Shards = Index Size / 30GB. Is the index size the store size? (screenshot attached)

3. Finally, I want to know how to set up a docker-compose cluster across machines. You provide a cluster sample, but it runs the whole cluster on one machine; I want to run it across different machines, one node per machine, and my attempts have failed. The docker-compose sample runs 3 nodes on one machine: https://opendistro.github.io/for-elasticsearch-docs/docs/install/docker/

------ data node ------
version: '3'
services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - network.host=0.0.0.0
      - node.name=odfe-node1
      - node.master=true
      - node.ingest=true
      - node.data=true
      - cluster.initial_master_nodes=odfe-master1
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233
      - discovery.seed_hosts=10.0.246.36,10.0.245.233
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233,10.0.245.10
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms32768m -Xmx32768m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /data/es/data/data:/usr/share/elasticsearch/data
      - /data/es/data/log:/usr/share/elasticsearch/log
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
networks:
  odfe-net:

Running docker-compose up fails with the error shown in the attached screenshot.

------ master node ------
version: '3'
services:
  odfe-master1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-master1
    environment:
      - cluster.name=odfe-cluster
      - network.host=0.0.0.0
      - node.name=odfe-master1
      - node.master=true
      - node.data=false
      - node.ingest=true
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233,10.0.245.10
      # - discovery.seed_hosts=10.0.217.122,10.0.222.36,10.0.215.78,10.0.246.15,10.0.246.36,10.0.245.233
      - discovery.seed_hosts=10.0.246.36,10.0.245.233
      - cluster.initial_master_nodes=odfe-master1
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /data/es/master/data:/usr/share/elasticsearch/data
      - /data/es/master/log:/usr/share/elasticsearch/log
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
networks:
  odfe-net:

Running docker-compose up fails with the error shown in the attached screenshot.

4. How do I warm up the index? I have looked at the website, but I did not find where to trigger the warm-up: https://opendistro.github.io/for-elasticsearch-docs/docs/knn/performance-tuning/

qingfengshiran avatar Apr 29 '21 05:04 qingfengshiran

Hi @qingfengshiran

1. How is the QPS on more than 100 million vectors?

We have seen latencies ranging from hundreds of milliseconds to 2000 ms for around 100 million vectors, depending on memory, CPU type, number of nodes, dimensions, etc. In order to get good performance, all of the vectors need to fit in memory. You can check this from the k-NN stats after running warmup or some queries.
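
A minimal sketch of that check, assuming the Open Distro k-NN stats endpoint (exact field names such as graph_memory_usage and eviction_count may vary by version, so treat them as illustrative):

# Per-node k-NN stats, including graph memory usage and cache evictions
curl "https://localhost:9200/_opendistro/_knn/stats?pretty" -u admin:admin --insecure

If the eviction count keeps growing after warmup, the graphs do not all fit in memory and query latency will suffer.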

2. I have found the shard count suggestion: Number of Shards = Index Size / 30GB. Is the index size the store size?

yes
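
A minimal sketch of checking the store size used in that calculation (index name illustrative):

# Primary store size per index; divide by ~30 GB to estimate the shard count
curl "https://localhost:9200/_cat/indices/my-knn-index?v&h=index,pri,docs.count,pri.store.size" -u admin:admin --insecure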

3. Finally, I want to know how to set up a docker-compose cluster across different machines, one node per machine. I have tried but failed.

We will get back to you on this.

4. How do I warm up the index? I have looked at the website, but I did not find where to trigger the warm-up.

Warmup documentation can be found here.
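
A minimal sketch of triggering it, assuming the Open Distro k-NN warmup API path (index names illustrative):

# Load the graphs for the listed indices into the off-heap cache before taking traffic
curl -XGET "https://localhost:9200/_opendistro/_knn/warmup/my-knn-index-1,my-knn-index-2?pretty" -u admin:admin --insecure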

jmazanec15 avatar May 10 '21 21:05 jmazanec15

Hi @qingfengshiran

For Docker containers, I believe it doesn't matter whether your service (in this case the data node) is in the same docker-compose file or a different one.

I ran the following experiment and it worked successfully for me.

  1. Created a docker compose file (data.yaml) with two nodes and kibana as below
version: '3'
services:
  odfe-node1:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node1
      - discovery.seed_hosts=odfe-node1,odfe-node2,odfe-node3
      - cluster.initial_master_nodes=odfe-node1,odfe-node2,odfe-node3
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the Elasticsearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - odfe-data1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - odfe-net
  odfe-node2:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node2
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node2
      - discovery.seed_hosts=odfe-node1,odfe-node2
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - odfe-data2:/usr/share/elasticsearch/data
    networks:
      - odfe-net
  kibana:
    image: amazon/opendistro-for-elasticsearch-kibana:1.13.2
    container_name: odfe-kibana
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      ELASTICSEARCH_URL: https://odfe-node1:9200
      ELASTICSEARCH_HOSTS: https://odfe-node1:9200
    networks:
      - odfe-net

volumes:
  odfe-data1:
  odfe-data2:

networks:
  odfe-net:
  2. Executed docker-compose up. Note: you will see an exception that odfe-node3 is not reachable; that is fine since we haven't started that node yet.
docker-compose -f data.yaml up


docker ps

3c22fb3bf9b7:~ balasvij$ docker ps
CONTAINER ID   IMAGE                                               COMMAND                  CREATED          STATUS          PORTS                                                                NAMES
be5925a3eacc   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   2 minutes ago    Up 2 minutes    0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp   odfe-node1
ac3a3e8aff1d   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   17 minutes ago   Up 2 minutes    9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp                               odfe-node2
7f96f019bda0   amazon/opendistro-for-elasticsearch-kibana:1.13.2   "/usr/local/bin/kiba…"   17 minutes ago   Up 2 minutes    0.0.0.0:5601->5601/tcp                                               odfe-kibana


curl https://localhost:9200/_cat/nodes -u admin:admin --insecure
172.17.0.2 38 81 14 1.41 0.67 0.38 dimr - odfe-node2
172.17.0.4 42 81  9 1.41 0.67 0.38 dimr * odfe-node1

  3. Created a docker-compose file (node3.yaml) as below
version: '3'
services:
  odfe-node3:
    image: amazon/opendistro-for-elasticsearch:1.13.2
    container_name: odfe-node3
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node3
      - discovery.seed_hosts=odfe-node1,odfe-node2,odfe-node3
      - cluster.initial_master_nodes=odfe-node1,odfe-node2,odfe-node3
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - odfe-data3:/usr/share/elasticsearch/data
    networks:
      - odfe-net
volumes:
  odfe-data3:

networks:
  odfe-net:
  4. Executed node3.yaml
docker-compose -f node3.yaml up

3c22fb3bf9b7:~ balasvij$ docker ps
CONTAINER ID   IMAGE                                               COMMAND                  CREATED          STATUS          PORTS                                                                NAMES
6ad96f65fcbe   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   35 seconds ago   Up 32 seconds   9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp                               odfe-node3
be5925a3eacc   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   2 minutes ago    Up 2 minutes    0.0.0.0:9200->9200/tcp, 9300/tcp, 0.0.0.0:9600->9600/tcp, 9650/tcp   odfe-node1
ac3a3e8aff1d   amazon/opendistro-for-elasticsearch:1.13.2          "/usr/local/bin/dock…"   17 minutes ago   Up 2 minutes    9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp                               odfe-node2
7f96f019bda0   amazon/opendistro-for-elasticsearch-kibana:1.13.2   "/usr/local/bin/kiba…"   17 minutes ago   Up 2 minutes    0.0.0.0:5601->5601/tcp                                               odfe-kibana

curl https://localhost:9200/_cat/nodes -u admin:admin --insecure
172.17.0.2 20 97 10 1.02 0.68 0.40 dimr - odfe-node2
172.17.0.5 44 97 26 1.02 0.68 0.40 dimr - odfe-node3
172.17.0.4 47 97 10 1.02 0.68 0.40 dimr * odfe-node1


Let me know if you still have issues. I ran the above experiment on macOS with Docker.

VijayanB avatar May 10 '21 23:05 VijayanB