[Bug]: The milvus server did not receive any requests but dataNode memory was doubled

Open ThreadDao opened this issue 1 year ago • 9 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: master-20240528-5e39aa92-amd64
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

  1. deploy milvus with 2 dataNodes
  2. create collection with 2 shards -> index -> insert 3m-128d data -> flush -> index and load
  3. concurrent requests: insert + delete + flush + search
                                 'concurrent_params': {'concurrent_number': 50,
                                                       'during_time': '5h',
                                                       'interval': 120,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 21,
                                                       'params': {'nq': 100,
                                                                  'random_data': True,
                                                                  'top_k': 100,
                                                                  'search_param': {'ef': 128},
                                                                  'timeout': 120}},
                                                      {'type': 'insert',
                                                       'weight': 10,
                                                       'params': {'nb': 200,
                                                                  'start_id': 3000000,
                                                                  'random_id': True,
                                                                  'random_vector': True,
                                                                  'timeout': 60}},
                                                      {'type': 'delete',
                                                       'weight': 10,
                                                       'params': {'delete_length': 150,
                                                                  'timeout': 60}},
                                                      {'type': 'flush',
                                                       'weight': 5,
                                                       'params': {'timeout': 60}}]},
  4. All requests finished, compaction completed, and GC completed. After that, with no requests reaching the server, the memory of the dataNode suddenly doubled.

     [images: metrics of level-zero-insert-op-11-2267]
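
For reference, a minimal sketch of the request mix implied by the `concurrent_tasks` weights in step 3. It assumes each weight is a relative selection probability (Locust-style task weighting); the issue does not confirm that interpretation, and `task_shares` is a hypothetical helper, not part of the test framework.

```python
# Weights copied from the 'concurrent_tasks' config in step 3.
TASK_WEIGHTS = {"search": 21, "insert": 10, "delete": 10, "flush": 5}


def task_shares(weights):
    """Return each task's expected fraction of all requests.

    Assumption: a weight is a relative selection probability, so a task's
    share is its weight divided by the sum of all weights.
    """
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


shares = task_shares(TASK_WEIGHTS)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, search makes up a little under half of the requests (21 of 46), insert and delete are issued equally often, and flush is the rarest task.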

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

pods:

level-zero-insert-op-11-2267-etcd-0                               0/1     Pending                  0                27h     <none>          <none>       <none>           <none>
level-zero-insert-op-11-2267-etcd-1                               1/1     Running                  0                31h     10.104.15.40    4am-node20   <none>           <none>
level-zero-insert-op-11-2267-etcd-2                               1/1     Running                  0                31h     10.104.25.146   4am-node30   <none>           <none>
level-zero-insert-op-11-2267-milvus-datanode-67fc694b86-46qbw     1/1     Running                  0                31h     10.104.1.94     4am-node10   <none>           <none>
level-zero-insert-op-11-2267-milvus-datanode-67fc694b86-8kzsj     1/1     Running                  0                31h     10.104.14.51    4am-node18   <none>           <none>
level-zero-insert-op-11-2267-milvus-indexnode-95c669f79-9vtrk     1/1     Running                  0                31h     10.104.27.179   4am-node31   <none>           <none>
level-zero-insert-op-11-2267-milvus-indexnode-95c669f79-qv8gz     1/1     Running                  0                31h     10.104.18.252   4am-node25   <none>           <none>
level-zero-insert-op-11-2267-milvus-mixcoord-d59c44479-n7wt8      1/1     Running                  0                31h     10.104.18.251   4am-node25   <none>           <none>
level-zero-insert-op-11-2267-milvus-proxy-7894bcd57f-xrjmg        1/1     Running                  0                31h     10.104.14.52    4am-node18   <none>           <none>
level-zero-insert-op-11-2267-milvus-querynode-0-66cf9dd8566bhwl   1/1     Running                  0                31h     10.104.30.173   4am-node38   <none>           <none>
level-zero-insert-op-11-2267-milvus-querynode-0-66cf9dd856tb5t6   1/1     Running                  0                31h     10.104.1.95     4am-node10   <none>           <none>
level-zero-insert-op-11-2267-milvus-querynode-0-66cf9dd856wtlwq   1/1     Running                  0                31h     10.104.14.53    4am-node18   <none>           <none>
level-zero-insert-op-11-2267-milvus-querynode-0-66cf9dd856zss98   1/1     Running                  0                31h     10.104.16.97    4am-node21   <none>           <none>
level-zero-insert-op-11-2267-minio-0                              1/1     Running                  0                31h     10.104.15.41    4am-node20   <none>           <none>
level-zero-insert-op-11-2267-minio-1                              1/1     Running                  0                31h     10.104.20.164   4am-node22   <none>           <none>
level-zero-insert-op-11-2267-minio-2                              1/1     Running                  0                31h     10.104.23.119   4am-node27   <none>           <none>
level-zero-insert-op-11-2267-minio-3                              0/1     Pending                  0                27h     <none>          <none>       <none>           <none>
level-zero-insert-op-11-2267-pulsar-bookie-0                      1/1     Running                  0                31h     10.104.15.42    4am-node20   <none>           <none>
level-zero-insert-op-11-2267-pulsar-bookie-1                      1/1     Running                  0                31h     10.104.23.120   4am-node27   <none>           <none>
level-zero-insert-op-11-2267-pulsar-bookie-2                      1/1     Running                  0                31h     10.104.25.149   4am-node30   <none>           <none>
level-zero-insert-op-11-2267-pulsar-broker-0                      1/1     Running                  0                31h     10.104.13.94    4am-node16   <none>           <none>
level-zero-insert-op-11-2267-pulsar-proxy-0                       1/1     Running                  0                31h     10.104.5.195    4am-node12   <none>           <none>
level-zero-insert-op-11-2267-pulsar-recovery-0                    1/1     Running                  0                31h     10.104.1.91     4am-node10   <none>           <none>
level-zero-insert-op-11-2267-pulsar-zookeeper-0                   1/1     Running                  0                31h     10.104.15.43    4am-node20   <none>           <none>
level-zero-insert-op-11-2267-pulsar-zookeeper-1                   0/1     Pending                  0                27h     <none>          <none>       <none>           <none>
level-zero-insert-op-11-2267-pulsar-zookeeper-2                   1/1     Running                  0                31h     10.104.20.168   4am-node22   <none>           <none>

Anything else?

No response

ThreadDao · May 29 '24 15:05