[Bug]: [benchmark] The indexnode cpu is not released after the build diskann index is completed

Open elstic opened this issue 10 months ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version:2.4-20240417-8f7ac8f7-amd64
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

case: test_concurrent_locust_100m_diskann_ddl_dql_filter_output_cluster
argo task: fouramf-hph7c

The client log shows that index creation completed (client pod: fouramf-hph7c-3277944962):

[2024-04-17 13:52:00,701 -  INFO - fouram]: [Base] Index params of fouram_9jrPTDk7:[{'float_vector': {'index_type': 'DISKANN', 'metric_type': 'L2', 'params': {}}}] (base.py:481)
[2024-04-17 13:52:00,702 -  INFO - fouram]: [Base] Start build index of DISKANN for field:float_vector collection:fouram_9jrPTDk7, params:{'index_type': 'DISKANN', 'metric_type': 'L2', 'params': {}}, kwargs:{} (base.py:462)
[2024-04-17 19:29:27,842 -  INFO - fouram]: [Time] Index run in 20247.1398s (api_request.py:49)
[2024-04-17 19:29:27,843 -  INFO - fouram]: [CommonCases] RT of build index DISKANN: 20247.1398s (common_cases.py:150)
[2024-04-17 19:29:27,843 -  INFO - fouram]: [CommonCases] Prepare index DISKANN done. (common_cases.py:152)
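As a quick sanity check on the reported build time, the elapsed time can be recomputed from the "Start build index" and "Index run in" log timestamps above. This is only a sketch using the Python standard library; the variable names are illustrative and not part of the benchmark client.

```python
from datetime import datetime

# Timestamps copied from the client log lines above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
build_started = datetime.strptime("2024-04-17 13:52:00,702", LOG_TIME_FORMAT)
build_finished = datetime.strptime("2024-04-17 19:29:27,842", LOG_TIME_FORMAT)

# Difference between the two log entries, in seconds and in hours.
elapsed_seconds = (build_finished - build_started).total_seconds()
elapsed_hours = elapsed_seconds / 3600
print(f"{elapsed_seconds:.3f}s (~{elapsed_hours:.2f}h)")
```

The result agrees with the 20247.1398s reported by the client, so the build took a little over five and a half hours.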

Indexing finished at 2024-04-17 19:29:27, after which the concurrent query, search, and other operations started, but the indexnode CPU was not released.

image

Index tasks are shown as incomplete: image
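One way to cross-check the index state from the client side is to ask Milvus for the build progress. The following is a minimal sketch, assuming the pymilvus ORM API (`utility.index_building_progress`) and a reachable proxy; the host, port, and the helper names `summarize_progress` and `check_index` are hypothetical, while the collection name is taken from the client log.

```python
def summarize_progress(progress):
    """Turn an index_building_progress-style dict into (done, fraction_indexed)."""
    total = progress.get("total_rows", 0)
    indexed = progress.get("indexed_rows", 0)
    fraction = indexed / total if total else 0.0
    # Only report "done" when there are rows and all of them are indexed.
    done = total > 0 and indexed >= total
    return done, fraction


def check_index(collection_name="fouram_9jrPTDk7", host="localhost", port="19530"):
    # Imported lazily so summarize_progress works without pymilvus installed.
    from pymilvus import connections, utility

    # Assumption: host and port point at the Milvus proxy of the test cluster.
    connections.connect(alias="default", host=host, port=port)
    progress = utility.index_building_progress(collection_name)
    return summarize_progress(progress)
```

If `check_index()` reports the index as fully built while the indexnode is still busy, that would suggest the CPU usage comes from other index tasks (for example those created by scene_test) rather than from the 100M-row DiskANN build.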

Test env: 4am cluster. Server pods:

fouramf-hph7c-38-1896-etcd-0                                      1/1     Running                           0               16h     10.104.20.187   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-etcd-1                                      1/1     Running                           0               16h     10.104.19.63    4am-node28   <none>           <none>
fouramf-hph7c-38-1896-etcd-2                                      1/1     Running                           0               16h     10.104.33.130   4am-node36   <none>           <none>
fouramf-hph7c-38-1896-milvus-datacoord-869599bcd8-mgnpj           1/1     Running                           0               16h     10.104.20.176   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-milvus-datanode-6ccbb964d-6qlzr             1/1     Running                           1 (16h ago)     16h     10.104.15.215   4am-node20   <none>           <none>
fouramf-hph7c-38-1896-milvus-indexcoord-66dd7985d7-7rltv          1/1     Running                           0               16h     10.104.15.216   4am-node20   <none>           <none>
fouramf-hph7c-38-1896-milvus-indexnode-69df46cdfb-2rqdv           1/1     Running                           0               16h     10.104.20.171   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-milvus-proxy-866ffd446f-ptrsg               1/1     Running                           1 (16h ago)     16h     10.104.20.170   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-milvus-querycoord-7dfbcbd946-pnvnp          1/1     Running                           1 (16h ago)     16h     10.104.33.122   4am-node36   <none>           <none>
fouramf-hph7c-38-1896-milvus-querynode-859f8b5bfd-dwr4p           1/1     Running                           0               16h     10.104.15.217   4am-node20   <none>           <none>
fouramf-hph7c-38-1896-milvus-rootcoord-7758f9db84-q5zgq           1/1     Running                           0               16h     10.104.20.173   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-minio-0                                     1/1     Running                           0               16h     10.104.19.62    4am-node28   <none>           <none>
fouramf-hph7c-38-1896-minio-1                                     1/1     Running                           0               16h     10.104.20.186   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-minio-2                                     1/1     Running                           0               16h     10.104.33.131   4am-node36   <none>           <none>
fouramf-hph7c-38-1896-minio-3                                     1/1     Running                           0               16h     10.104.34.35    4am-node37   <none>           <none>
fouramf-hph7c-38-1896-pulsar-bookie-0                             1/1     Running                           0               16h     10.104.20.185   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-pulsar-bookie-1                             1/1     Running                           0               16h     10.104.19.66    4am-node28   <none>           <none>
fouramf-hph7c-38-1896-pulsar-bookie-2                             1/1     Running                           0               16h     10.104.33.132   4am-node36   <none>           <none>
fouramf-hph7c-38-1896-pulsar-bookie-init-68642                    0/1     Completed                         0               16h     10.104.20.178   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-pulsar-broker-0                             1/1     Running                           0               16h     10.104.15.214   4am-node20   <none>           <none>
fouramf-hph7c-38-1896-pulsar-proxy-0                              1/1     Running                           0               16h     10.104.20.180   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-pulsar-pulsar-init-rqjcq                    0/1     Completed                         0               16h     10.104.20.172   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-pulsar-recovery-0                           1/1     Running                           0               16h     10.104.33.123   4am-node36   <none>           <none>
fouramf-hph7c-38-1896-pulsar-zookeeper-0                          1/1     Running                           0               16h     10.104.33.126   4am-node36   <none>           <none>
fouramf-hph7c-38-1896-pulsar-zookeeper-1                          1/1     Running                           0               16h     10.104.20.189   4am-node22   <none>           <none>
fouramf-hph7c-38-1896-pulsar-zookeeper-2                          1/1     Running                           0               16h     10.104.19.68    4am-node28   <none>           <none>

Expected Behavior

No response

Steps To Reproduce

1. Create a collection.
2. Build a DiskANN index on the vector column.
3. Insert 100 million vectors.
4. Flush the collection.
5. Build the index on the vector column again with the same parameters.
6. Count the total number of rows.
7. Load the collection.
8. Run concurrent search, query, load, and scene_test operations.
   (scene_test steps:
      1) Create a collection  2) Insert 3000 rows  3) Flush the collection
      4) Create an index  5) Drop the collection)
9. Keep step 8 running for 12 hours.
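The steps above can be sketched with the pymilvus ORM API roughly as follows. This is a scaled-down illustration, not the benchmark client: the collection name, the `id` field, the vector dimension, the row counts, the host/port, and the `search_list` value are all assumptions, and step 8 is reduced to a single search instead of a 12-hour concurrent run. Only the field name `float_vector` and the index parameters come from the client log.

```python
import random

# Index parameters exactly as reported in the client log.
INDEX_PARAMS = {"index_type": "DISKANN", "metric_type": "L2", "params": {}}
VECTOR_FIELD = "float_vector"   # field name taken from the client log
DIM = 128                       # assumption: the issue does not state the dimension


def make_batch(start_id, size, dim=DIM):
    """Return column-ordered data: [ids, vectors] with random float vectors."""
    ids = list(range(start_id, start_id + size))
    vectors = [[random.random() for _ in range(dim)] for _ in range(size)]
    return [ids, vectors]


def reproduce(total_rows=100_000, batch_size=10_000, host="localhost", port="19530"):
    # Imported here so the helpers above can be used without pymilvus installed.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

    connections.connect(alias="default", host=host, port=port)

    # Step 1: create a collection (name and schema are hypothetical).
    schema = CollectionSchema([
        FieldSchema("id", DataType.INT64, is_primary=True),
        FieldSchema(VECTOR_FIELD, DataType.FLOAT_VECTOR, dim=DIM),
    ])
    collection = Collection("diskann_cpu_repro", schema)

    # Step 2: build a DiskANN index before any data exists.
    collection.create_index(VECTOR_FIELD, INDEX_PARAMS)

    # Step 3: insert vectors in batches (the issue used 100 million rows).
    for start in range(0, total_rows, batch_size):
        collection.insert(make_batch(start, min(batch_size, total_rows - start)))

    # Steps 4-7: flush, build again with the same parameters, count rows, load.
    collection.flush()
    collection.create_index(VECTOR_FIELD, INDEX_PARAMS)
    print("rows:", collection.num_entities)
    collection.load()

    # Step 8, reduced to one search request; search_list is an assumed value.
    query = [[random.random() for _ in range(DIM)]]
    search_params = {"metric_type": "L2", "params": {"search_list": 30}}
    return collection.search(query, VECTOR_FIELD, search_params, limit=10)
```

Calling `reproduce()` against a test cluster and then watching the indexnode pod's CPU after the second `create_index` returns would approximate the situation described in this issue, although at this reduced scale the behavior may not reproduce.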

Milvus Log

No response

Anything else?

No response

elstic · Apr 18 '24 04:04