milvus
milvus copied to clipboard
[Bug]: [benchmark][cluster]Milvus load , search timed out and reported an error"rpc deadline exceeded:Retry timeout"
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version:2.1.0-20220708-11d21a3b
- Deployment mode(standalone or cluster):cluster
- SDK version(e.g. pymilvus v2.0.0rc2):pymilvus 2.1.0dev87
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
server-instance benchmark-tag-no-clean-xkxrt-1 server-configmap server-cluster-8c64m-querynode2 client-configmap client-random-locust-search-filter-100m-ddl-r8-w2-replica2
2022-07-11 01:36:09,935] [ DEBUG] - Milvus get run in 9.379s (milvus_benchmark.client:54)
[2022-07-11 01:36:09,936] [ DEBUG] - Milvus get run in 9.3797s (milvus_benchmark.client:54)
[2022-07-11 01:36:09,936] [ DEBUG] - Milvus get run in 9.3805s (milvus_benchmark.client:54)
[2022-07-11 01:36:09,937] [ DEBUG] - Milvus query run in 9.3818s (milvus_benchmark.client:54)
[2022-07-11 01:36:09,938] [ DEBUG] - Milvus query run in 9.3826s (milvus_benchmark.client:54)
[2022-07-11 01:36:10,143] [ DEBUG] - params of load collection: {'timeout': 30, 'replica_number': 2} (milvus_benchmark.client:494)
[2022-07-11 01:36:10,860] [ DEBUG] - Milvus flush run in 1.2489s (milvus_benchmark.client:54)
[2022-07-11 01:36:10,861] [ DEBUG] - Milvus get_info run in 1.2493s (milvus_benchmark.client:54)
[2022-07-11 01:36:10,861] [ DEBUG] - [scene_test] Start insert : scene_test_7482_352821 (milvus_benchmark.client:543)
[2022-07-11 01:36:11,071] [ DEBUG] - Milvus get_info run in 1.4597s (milvus_benchmark.client:54)
[2022-07-11 01:36:11,073] [ DEBUG] - [scene_test] Start insert : scene_test_6506_300073 (milvus_benchmark.client:543)
[2022-07-11 01:36:11,077] [ DEBUG] - Milvus get_info run in 1.465s (milvus_benchmark.client:54)
[2022-07-11 01:39:17,942] [ DEBUG] - [scene_test] Start insert : scene_test_6386_825821 (milvus_benchmark.client:543)
[2022-07-11 01:39:18,191] [ DEBUG] - Milvus get_info run in 188.5794s (milvus_benchmark.client:54)
[2022-07-11 01:39:18,232] [ DEBUG] - [scene_test] Start insert : scene_test_9685_298548 (milvus_benchmark.client:543)
[2022-07-11 01:39:18,233] [ DEBUG] - Milvus query run in 188.6217s (milvus_benchmark.client:54)
[2022-07-11 01:39:18,373] [ ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:36:10.143693', 'RPC error': '2022-07-11 01:39:18.373718'}> (pymilvus.decorators:94)
[2022-07-11 01:39:18,374] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:36:10.143632', 'RPC error': '2022-07-11 01:39:18.374575'}> (pymilvus.decorators:94)
[2022-07-11 01:39:18,375] [ ERROR] - RPC error: [load_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2022-07-11 01:36:10.757697', 'RPC error': '2022-07-11 01:39:18.375385'}> (pymilvus.decorators:94)
[2022-07-11 01:39:18,376] [ ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:36:10.758118', 'RPC error': '2022-07-11 01:39:18.376225'}> (pymilvus.decorators:94)
[2022-07-11 01:39:18,789] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:36:10.758065', 'RPC error': '2022-07-11 01:39:18.789384'}> (pymilvus.decorators:94)
[2022-07-11 01:39:18,863] [ DEBUG] - Row count: 3000 in collection: <scene_test_8114_17319> (milvus_benchmark.client:428)
[2022-07-11 01:39:18,864] [ DEBUG] - [scene_test] Start create index : scene_test_8114_17319 (milvus_benchmark.client:550)
[2022-07-11 01:39:18,865] [ INFO] - Building index start, collection_name: scene_test_8114_17319, index_type: IVF_SQ8, metric_type: L2 (milvus_benchmark.client:279)
[2022-07-11 01:39:18,865] [ INFO] - {'nlist': 2048} (milvus_benchmark.client:281)
[2022-07-11 01:39:18,866] [ DEBUG] - collection: scene_test_8114_17319 Index params: {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}} (milvus_benchmark.client:287)
benchmark-tag-no-clean-xkxrt-1-etcd-0 1/1 Running 0 2d12h 10.97.17.42 qa-node014.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-etcd-1 1/1 Running 0 2d12h 10.97.16.114 qa-node013.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-etcd-2 1/1 Running 0 2d12h 10.97.16.115 qa-node013.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-datacoord-b99d5768f-jc827 1/1 Running 0 2d12h 10.97.18.178 qa-node017.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-datanode-77bbb45fbd-d4lhr 1/1 Running 0 2d12h 10.97.17.40 qa-node014.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-indexcoord-6bfbcdbd6lbpwl 1/1 Running 0 2d12h 10.97.14.124 qa-node011.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-indexnode-7bcc8975c92j8d8 1/1 Running 0 2d12h 10.97.16.111 qa-node013.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-proxy-dcd7fc968-txmsf 1/1 Running 0 2d12h 10.97.13.62 qa-node010.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-querycoord-74bc88db4t8zcc 1/1 Running 0 2d12h 10.97.14.125 qa-node011.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-querynode-565fcbcc85gnsxs 1/1 Running 0 2d12h 10.97.13.63 qa-node010.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-querynode-565fcbcc85j9r4p 1/1 Running 0 2d12h 10.97.20.144 qa-node018.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-milvus-rootcoord-7b965fc85c987qt 1/1 Running 1 2d12h 10.97.14.123 qa-node011.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-minio-0 1/1 Running 0 2d12h 10.97.19.159 qa-node016.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-minio-1 1/1 Running 0 2d12h 10.97.19.160 qa-node016.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-minio-2 1/1 Running 0 2d12h 10.97.12.109 qa-node015.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-minio-3 1/1 Running 0 2d12h 10.97.12.115 qa-node015.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-bookie-0 1/1 Running 0 2d12h 10.97.19.162 qa-node016.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-bookie-1 1/1 Running 0 2d12h 10.97.12.116 qa-node015.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-bookie-2 1/1 Running 0 2d12h 10.97.12.117 qa-node015.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-bookie-init-ttgsd 0/1 Completed 0 2d12h 10.97.10.211 qa-node008.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-broker-0 1/1 Running 0 2d12h 10.97.10.212 qa-node008.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-proxy-0 1/1 Running 0 2d12h 10.97.19.155 qa-node016.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-pulsar-init-b4zt2 0/1 Completed 0 2d12h 10.97.5.65 qa-node003.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-recovery-0 1/1 Running 0 2d12h 10.97.19.154 qa-node016.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-zookeeper-0 1/1 Running 0 2d12h 10.97.5.67 qa-node003.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-zookeeper-1 1/1 Running 0 2d12h 10.97.5.69 qa-node003.zilliz.local <none> <none>
benchmark-tag-no-clean-xkxrt-1-pulsar-zookeeper-2 1/1 Running 0 2d12h 10.97.10.214 qa-node008.zilliz.local <none> <none>
Expected Behavior
No response
Steps To Reproduce
1、create collection
2、create index of ivf_sq8
3、insert 100m million vectors
4、flush collection
5、build index with the same params
6、load collection
7、locust concurrent: query<-search, load, get<-query, scene_test 《- raise error
Milvus Log
No response
Anything else?
client-random-locust-search-filter-100m-ddl-r8-w2-replica2:
{
"config.yaml": "locust_random_performance:
collections:
-
collection_name: sift_100m_128_l2
# collection_name: sift_10w_128_l2
other_fields: float1
ni_per: 50000
build_index: true
index_type: ivf_sq8
index_param:
nlist: 2048
load_param:
replica_number: 2
task:
types:
-
type: query
weight: 20
params:
top_k: 10
nq: 10
search_param:
nprobe: 16
filters:
-
range: \"{'range': {'float1': {'GT': -1.0, 'LT': collection_size * 0.5}}}\"
-
type: load
weight: 1
params:
replica_number: 2
-
type: get
weight: 10
params:
ids_length: 10
-
type: scene_test
weight: 2
connection_num: 1
clients_num: 20
spawn_rate: 2
# during_time: 100
during_time: 84h
"
}
/assign @jiaoew1991 /unassign
server-instance benchmark-tag-no-clean-9ndwc-1 server-configmap server-single-32c128m client-configmap client-random-locust-100m-ddl-r8-w2
benchmark-tag-no-clean-9ndwc-1-etcd-0 1/1 Running 0 2d14h 10.97.16.117 qa-node013.zilliz.local <none> <none>
benchmark-tag-no-clean-9ndwc-1-milvus-standalone-795f84b4bsx6q9 1/1 Running 0 2d14h 10.97.20.146 qa-node018.zilliz.local <none> <none>
benchmark-tag-no-clean-9ndwc-1-minio-f75cd79fd-b72g8 1/1 Running 0 2d14h 10.97.19.164 qa-node016.zilliz.local <none> <none>
error:
[2022-07-11 01:14:03,334] [ DEBUG] - [scene_test] Start scene test : scene_test_9258_431323 (milvus_benchmark.client:536)
[2022-07-11 01:14:03,390] [ DEBUG] - Row count: 3000 in collection: <scene_test_8137_690539> (milvus_benchmark.client:428)
[2022-07-11 01:14:56,177] [ DEBUG] - [scene_test] Start create index : scene_test_8137_690539 (milvus_benchmark.client:550)
[2022-07-11 01:15:03,825] [ INFO] - Building index start, collection_name: scene_test_8137_690539, index_type: IVF_SQ8, metric_type: L2 (milvus_benchmark.client:279)
[2022-07-11 01:15:03,826] [ INFO] - {'nlist': 2048} (milvus_benchmark.client:281)
[2022-07-11 01:15:04,033] [ DEBUG] - collection: scene_test_8137_690539 Index params: {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}} (milvus_benchmark.client:287)
[2022-07-11 01:15:05,463] [ DEBUG] - Row count: 3000 in collection: <scene_test_6966_667923> (milvus_benchmark.client:428)
[2022-07-11 01:15:05,464] [ DEBUG] - [scene_test] Start create index : scene_test_6966_667923 (milvus_benchmark.client:550)
[2022-07-11 01:15:05,465] [ INFO] - Building index start, collection_name: scene_test_6966_667923, index_type: IVF_SQ8, metric_type: L2 (milvus_benchmark.client:279)
[2022-07-11 01:15:05,465] [ INFO] - {'nlist': 2048} (milvus_benchmark.client:281)
[2022-07-11 01:15:05,466] [ DEBUG] - collection: scene_test_6966_667923 Index params: {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}} (milvus_benchmark.client:287)
[2022-07-11 01:15:05,467] [ DEBUG] - Milvus get run in 62.9965s (milvus_benchmark.client:54)
[2022-07-11 01:15:05,523] [ ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:14:03.334084', 'RPC error': '2022-07-11 01:15:05.523187'}> (pymilvus.decorators:94)
[2022-07-11 01:15:05,524] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:14:03.334046', 'RPC error': '2022-07-11 01:15:05.524140'}> (pymilvus.decorators:94)
[2022-07-11 01:15:05,525] [ ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:14:03.334173', 'RPC error': '2022-07-11 01:15:05.525835'}> (pymilvus.decorators:94)
[2022-07-11 01:15:05,526] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-11 01:14:03.334150', 'RPC error': '2022-07-11 01:15:05.526516'}> (pymilvus.decorators:94)
[2022-07-11 01:15:05,527] [ INFO] - Create collection: <scene_test_9258_431323> successfully (milvus_benchmark.client:152)
[2022-07-11 01:15:05,528] [ DEBUG] - Milvus create_collection run in 62.1933s (milvus_benchmark.client:54)
[2022-07-11 01:15:05,529] [ DEBUG] - Milvus get run in 62.1956s (milvus_benchmark.client:54)
[2022-07-11 01:15:05,530] [ DEBUG] - [scene_test] Start scene test : scene_test_1798_516649 (milvus_benchmark.client:536)
[2022-07-11 01:15:05,605] [ DEBUG] - Milvus get run in 0.1371s (milvus_benchmark.client:54)
[2022-07-11 01:15:05,607] [ DEBUG] - Milvus get run in 62.2728s (milvus_benchmark.client:54)
server-instance fouram-tag-no-clean-g4m6q-1 server-configmap server-single-32c128m-redur0 client-configmap client-random-locust-100m-ddl-r8-w2-60h
2.1.0-20220819-671d629f pymilvus:2.1.1.dev3
Verified, the problem has not yet appeared, first close issue