[Bug]: [benchmark][standalone] search, query, and load raise error: _MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded

Open wangting0128 opened this issue 2 years ago • 27 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: master-20220610-36ad9895
- Deployment mode(standalone or cluster): standalone
- SDK version(e.g. pymilvus v2.0.0rc2): 2.1.0.dev70
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: benchmark-backup-q258f

test yaml: client-configmap:client-random-locust-insert-delete server-configmap:server-single-32c128m-compaction

server:

NAME                                                          READY   STATUS      RESTARTS   AGE   IP             NODE                      NOMINATED NODE   READINESS GATES
benchmark-backup-q258f-1-etcd-0                               1/1     Running     0          35m   10.97.16.148   qa-node013.zilliz.local   <none>           <none>
benchmark-backup-q258f-1-milvus-standalone-7d9fb4bd8c-dlln7   1/1     Running     0          35m   10.97.17.14    qa-node014.zilliz.local   <none>           <none>
benchmark-backup-q258f-1-minio-864bd5df4-8gbgw                1/1     Running     0          35m   10.97.19.209   qa-node016.zilliz.local   <none>           <none>

client pod: benchmark-backup-q258f-341429661

client log:

[2022-06-13 03:26:58,926] [    INFO] -  (locust.stats_logger:733)
[2022-06-13 03:26:58,927] [   DEBUG] - Milvus get run in 194.6004s (milvus_benchmark.client:54)
[2022-06-13 03:26:58,928] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.085524', 'gRPC error': '2022-06-13 03:26:58.928661'}> (pymilvus.decorators:86)
[2022-06-13 03:26:58,929] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:43.992488', 'gRPC error': '2022-06-13 03:26:58.929541'}> (pymilvus.decorators:86)
[2022-06-13 03:26:58,930] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.085991', 'gRPC error': '2022-06-13 03:26:58.930694'}> (pymilvus.decorators:86)
[2022-06-13 03:26:58,931] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:43.992607', 'gRPC error': '2022-06-13 03:26:58.931458'}> (pymilvus.decorators:86)
[2022-06-13 03:26:58,933] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.086275', 'gRPC error': '2022-06-13 03:26:58.933148'}> (pymilvus.decorators:86)
[2022-06-13 03:26:58,934] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:43.992774', 'gRPC error': '2022-06-13 03:26:58.934021'}> (pymilvus.decorators:86)
[2022-06-13 03:26:58,936] [   ERROR] - grpc RpcError: [load_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.090003', 'gRPC error': '2022-06-13 03:26:58.936440'}> (pymilvus.decorators:86)
[2022-06-13 03:31:58,919] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.090244', 'gRPC error': '2022-06-13 03:31:58.919293'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,001] [   ERROR] - grpc RpcError: [load_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.090323', 'gRPC error': '2022-06-13 03:31:59.001228'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,002] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.090468', 'gRPC error': '2022-06-13 03:31:59.002465'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,003] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.090718', 'gRPC error': '2022-06-13 03:31:59.003459'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,005] [   DEBUG] - Milvus load_collection run in 494.6779s (milvus_benchmark.client:54)
[2022-06-13 03:31:59,007] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.916830', 'gRPC error': '2022-06-13 03:31:59.007012'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,007] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.019903', 'gRPC error': '2022-06-13 03:31:59.007747'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,008] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.917169', 'gRPC error': '2022-06-13 03:31:59.008632'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,009] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.020039', 'gRPC error': '2022-06-13 03:31:59.009511'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,010] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.917442', 'gRPC error': '2022-06-13 03:31:59.010396'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,011] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.020156', 'gRPC error': '2022-06-13 03:31:59.011019'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,011] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.917740', 'gRPC error': '2022-06-13 03:31:59.011951'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,012] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.020264', 'gRPC error': '2022-06-13 03:31:59.012766'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,013] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.918033', 'gRPC error': '2022-06-13 03:31:59.013696'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,014] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.020443', 'gRPC error': '2022-06-13 03:31:59.014345'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,015] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.327738', 'gRPC error': '2022-06-13 03:31:59.015259'}> (pymilvus.decorators:86)
[2022-06-13 03:31:59,017] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.935811', 'gRPC error': '2022-06-13 03:31:59.017381'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,007] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.936096', 'gRPC error': '2022-06-13 03:36:59.007089'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,008] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-13 03:26:58.936075', 'RPC error': '2022-06-13 03:36:59.008732'}> (pymilvus.decorators:78)
[2022-06-13 03:36:59,009] [   ERROR] - <MilvusException: (code=1, message=rpc timeout)> (milvus_benchmark.client:167)
[2022-06-13 03:36:59,010] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.936269', 'gRPC error': '2022-06-13 03:36:59.010585'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,011] [   ERROR] - grpc RpcError: [load_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:26:58.936346', 'gRPC error': '2022-06-13 03:36:59.011346'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,012] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.005090', 'gRPC error': '2022-06-13 03:36:59.012577'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,013] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.327482', 'gRPC error': '2022-06-13 03:36:59.013134'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,013] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.005470', 'gRPC error': '2022-06-13 03:36:59.013947'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,014] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.327621', 'gRPC error': '2022-06-13 03:36:59.014436'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,016] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.006886', 'gRPC error': '2022-06-13 03:36:59.016344'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,016] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:44.327914', 'gRPC error': '2022-06-13 03:36:59.016918'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,017] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:45.090552', 'gRPC error': '2022-06-13 03:36:59.017449'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,018] [    INFO] -  Name                                                                              # reqs      # fails  |     Avg     Min     Max  Median  |   req/s failures/s (locust.stats_logger:725)
[2022-06-13 03:36:59,019] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:727)
[2022-06-13 03:36:59,019] [    INFO] -  grpc delete                                                                         1573     0(0.00%)  |      97       3    1861      40  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-13 03:36:59,019] [    INFO] -  grpc get                                                                            7639     2(0.03%)  |     348       3  793927      74  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-13 03:36:59,020] [    INFO] -  grpc insert                                                                         1521     0(0.00%)  |     573      18  600074      98  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-13 03:36:59,020] [    INFO] -  grpc load_collection                                                                2976     3(0.10%)  |     746       3  600075      72  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-13 03:36:59,021] [    INFO] -  grpc query                                                                         15079    11(0.07%)  |     570       6  794689     110  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-13 03:36:59,021] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:731)
[2022-06-13 03:36:59,021] [    INFO] -  Aggregated                                                                         28788    16(0.06%)  |     503       3  794689      95  |    0.00    0.00 (locust.stats_logger:732)
[2022-06-13 03:36:59,021] [    INFO] -  (locust.stats_logger:733)
[2022-06-13 03:36:59,024] [   ERROR] - grpc RpcError: [load_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.016139', 'gRPC error': '2022-06-13 03:36:59.024911'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,027] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.016376', 'gRPC error': '2022-06-13 03:36:59.027859'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,028] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.016485', 'gRPC error': '2022-06-13 03:36:59.028309'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,028] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.016781', 'gRPC error': '2022-06-13 03:36:59.028702'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,029] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-13 03:31:59.016766', 'RPC error': '2022-06-13 03:36:59.029031'}> (pymilvus.decorators:78)
[2022-06-13 03:36:59,029] [   ERROR] - <MilvusException: (code=1, message=rpc timeout)> (milvus_benchmark.client:167)
[2022-06-13 03:36:59,029] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.016968', 'gRPC error': '2022-06-13 03:36:59.029634'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,030] [   ERROR] - grpc RpcError: [load_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.017074', 'gRPC error': '2022-06-13 03:36:59.029973'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,030] [   ERROR] - grpc RpcError: [load_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.017143', 'gRPC error': '2022-06-13 03:36:59.030333'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,030] [   ERROR] - grpc RpcError: [has_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:31:59.017292', 'gRPC error': '2022-06-13 03:36:59.030724'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,085] [   DEBUG] - Milvus query run in 300.1662s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,109] [   DEBUG] - Milvus load_collection run in 0.0763s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,110] [   DEBUG] - Milvus load_collection run in 0.0868s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,113] [   DEBUG] - Milvus get run in 0.0854s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,114] [   DEBUG] - Milvus query run in 794.0245s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,148] [   DEBUG] - Milvus load_collection run in 0.1152s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,149] [   DEBUG] - Milvus query run in 600.2135s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,150] [   DEBUG] - Milvus query run in 0.1256s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,150] [   DEBUG] - Milvus query run in 794.0605s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,151] [   DEBUG] - Milvus query run in 0.127s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,163] [   DEBUG] - Milvus query run in 0.139s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,166] [   DEBUG] - Milvus query run in 0.1413s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,175] [   DEBUG] - Milvus query run in 794.0845s (milvus_benchmark.client:54)
[2022-06-13 03:36:59,202] [   DEBUG] - Milvus get run in 0.0871s (milvus_benchmark.client:54)
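
Each error line above carries both the RPC start time and the time the error was raised, so the time a call spent waiting before it failed can be computed directly from the log. The following is a minimal sketch using only the Python standard library; the names `ERROR_LINE` and `rpc_durations` are illustrative and not part of pymilvus or milvus_benchmark.

```python
import re
from datetime import datetime

# Pattern for the pymilvus.decorators error lines in the client log above.
# Both the "gRPC error" and the "RPC error" variants are accepted.
ERROR_LINE = re.compile(
    r"\[(?P<rpc>\w+)\], <.*?>, "
    r"<Time:\{'RPC start': '(?P<start>[^']+)', "
    r"'(?:gRPC|RPC) error': '(?P<end>[^']+)'\}>"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def rpc_durations(log_text):
    """Return a list of (rpc_name, seconds_from_start_to_error) tuples."""
    durations = []
    for match in ERROR_LINE.finditer(log_text):
        start = datetime.strptime(match["start"], TS_FORMAT)
        end = datetime.strptime(match["end"], TS_FORMAT)
        durations.append((match["rpc"], (end - start).total_seconds()))
    return durations


# Two lines copied verbatim from the client log above.
SAMPLE = """\
[2022-06-13 03:26:58,929] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-13 03:23:43.992488', 'gRPC error': '2022-06-13 03:26:58.929541'}> (pymilvus.decorators:86)
[2022-06-13 03:36:59,008] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-13 03:26:58.936075', 'RPC error': '2022-06-13 03:36:59.008732'}> (pymilvus.decorators:78)
"""

for rpc, seconds in rpc_durations(SAMPLE):
    print(f"{rpc}: {seconds:.1f}s")
```

For these two lines the waits come out at roughly 195 s and 600 s, which lines up with the "run in 194.6004s" debug line and the 600074 ms maximum reported for insert in the locust table.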

Expected Behavior

No response

Steps To Reproduce

1. Create the collection
2. Create an ivf_sq8 index
3. Insert 10w (100,000) vectors
4. Flush the collection
5. Build the index with the same params
6. Load the collection
7. Run locust concurrently: query<-search, load, get<-query, delete, insert <- raises the error

Milvus Log

No response

Anything else?

client-random-locust-insert-delete:

    locust_random_performance:
      collections:
        -
          collection_name: sift_10w_128_l2
          ni_per: 50000
          other_fields: float1
          build_index: true
          index_type: ivf_sq8
          index_param:
            nlist: 2048
          task:
            types:
              -
                type: query
                weight: 10
                params:
                  top_k: 10
                  nq: 10
                  search_param:
                    nprobe: 16
                  filters:
                    -
                      range: "{'range': {'float1': {'GT': collection_size * 0.5, 'LT': collection_size * 1}}}"
              -
                type: load
                weight: 2
              -
                type: get
                weight: 5
                params:
                  ids_length: 10
              -
                type: insert
                weight: 1
                params:
                  ni_per: 1000
              -
                type: delete
                weight: 1
                params:
                  ni_per: 100
            connection_num: 1
            clients_num: 20
            spawn_rate: 2
            during_time: 30m
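
As a rough cross-check of the config above, the task weights can be turned into an expected request mix and compared with the request counts in the locust stats table from the client log. This sketch assumes the benchmark picks tasks with probability proportional to `weight` (the usual locust behavior); the helper name `shares` is illustrative, and the `load` task is matched to the `grpc load_collection` row.

```python
from fractions import Fraction

# Task weights copied from the client-random-locust-insert-delete config above.
TASK_WEIGHTS = {"query": 10, "load": 2, "get": 5, "insert": 1, "delete": 1}

# Request counts copied from the locust stats table in the client log
# ("load" corresponds to the "grpc load_collection" row).
OBSERVED_REQUESTS = {
    "query": 15079,
    "load": 2976,
    "get": 7639,
    "insert": 1521,
    "delete": 1573,
}


def shares(counts):
    """Normalize a name -> count mapping into exact fractions that sum to 1."""
    total = sum(counts.values())
    return {name: Fraction(count, total) for name, count in counts.items()}


expected = shares(TASK_WEIGHTS)
observed = shares(OBSERVED_REQUESTS)
for name in TASK_WEIGHTS:
    print(
        f"{name:>6}: expected {float(expected[name]):6.1%}"
        f"  observed {float(observed[name]):6.1%}"
    )
```

The observed shares stay within about one percentage point of the weighted expectation, which suggests the stalled requests did not noticeably skew the task mix over the run.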

wangting0128 avatar Jun 13 '22 04:06 wangting0128

/assign @xiaofan-luan /unassign

yanliang567 avatar Jun 13 '22 07:06 yanliang567

etcd log:

{"level":"info","ts":"2022-06-13T03:52:27.417Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}

{"level":"info","ts":"2022-06-13T03:52:27.417Z","caller":"embed/etcd.go:367","msg":"closing etcd server","name":"benchmark-backup-q258f-1-etcd-0","data-dir":"/bitnami/etcd/data","advertise-peer-urls":["http://benchmark-backup-q258f-1-etcd-0.benchmark-backup-q258f-1-etcd-headless.qa-milvus.svc.cluster.local:2380"],"advertise-client-urls":["http://benchmark-backup-q258f-1-etcd-0.benchmark-backup-q258f-1-etcd-headless.qa-milvus.svc.cluster.local:2379"]}

WARNING: 2022/06/13 03:52:27 [core] grpc: addrConn.createTransport failed to connect to {0.0.0.0:2379 0.0.0.0:2379 0 }. Err: connection error: desc = "transport: Error while dialing dial tcp 0.0.0.0:2379: connect: connection refused". Reconnecting...

etcd was killed for no reason.

Standalone's log:

[2022/06/13 03:52:33.470 +00:00] [ERROR] [shard_segment_detector.go:132] ["failed to handle watch segment error, panic"] [error="rpc error: code = Canceled desc = latest balancer error: last connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup benchmark-backup-q258f-1-etcd on 10.96.0.10:53: server misbehaving""] [stack="github.com/milvus-io/milvus/internal/querynode.(*etcdShardSegmentDetector).watch\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/shard_segment_detector.go:132"]

panic: rpc error: code = Canceled desc = latest balancer error: last connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup benchmark-backup-q258f-1-etcd on 10.96.0.10:53: server misbehaving"

This image does not contain LD_PRELOAD jemalloc modifications.

czs007 avatar Jun 14 '22 03:06 czs007

etcd was killed deliberately during the cleanup process; it has nothing to do with the client errors posted above.

czs007 avatar Jun 14 '22 06:06 czs007

@longjiquan help

czs007 avatar Jun 17 '22 08:06 czs007

The QueryNode log indicates that the requests that timed out were ready to run, so the time tick did not cause this issue. It is hard to tell at which step each request finally stopped, because the log lines all look the same and carry no msg ID.

longjiquan avatar Jun 21 '22 07:06 longjiquan

/assign @xige-16

xige-16 avatar Jun 21 '22 09:06 xige-16

The log is gone, can you run it again? @wangting0128

xige-16 avatar Jun 21 '22 11:06 xige-16

- server-instance: test-etcd-no-clean-lmj2p-1
- server-configmap: server-single-32c128m-compaction
- client-configmap: client-random-locust-insert-delete
- Milvus version: master-20220622-12158432
- pymilvus: 2.1.0dev78

test-etcd-no-clean-lmj2p-1-0                                    1/1     Running     0          2m3s   10.97.16.34    qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-lmj2p-1-milvus-standalone-77dd5bc9f6-dxs9x   1/1     Running     0          2m3s   10.97.17.145   qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-lmj2p-1-minio-78b484c965-m28s9               1/1     Running     0          2m3s   10.97.16.35    qa-node013.zilliz.local   <none>           <none>

client log:

[2022-06-22 04:11:24,638] [    INFO] -  (locust.stats_logger:733)
[2022-06-22 04:11:24,639] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.008924', 'gRPC error': '2022-06-22 04:11:24.639027'}> (pymilvus.decorators:86)
[2022-06-22 04:11:24,639] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.009482', 'gRPC error': '2022-06-22 04:11:24.639526'}> (pymilvus.decorators:86)
[2022-06-22 04:11:24,640] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.009545', 'gRPC error': '2022-06-22 04:11:24.640027'}> (pymilvus.decorators:86)
[2022-06-22 04:11:24,640] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.009149', 'gRPC error': '2022-06-22 04:11:24.640505'}> (pymilvus.decorators:86)
[2022-06-22 04:11:24,642] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.670353', 'gRPC error': '2022-06-22 04:11:24.642844'}> (pymilvus.decorators:86)
[2022-06-22 04:11:24,643] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:06:24.670281', 'RPC error': '2022-06-22 04:11:24.643233'}> (pymilvus.decorators:78)
[2022-06-22 04:16:24,632] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.670675', 'gRPC error': '2022-06-22 04:16:24.632766'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,633] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:06:24.670627', 'RPC error': '2022-06-22 04:16:24.633603'}> (pymilvus.decorators:78)
[2022-06-22 04:16:24,634] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.670926', 'gRPC error': '2022-06-22 04:16:24.634404'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,634] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:06:24.670895', 'RPC error': '2022-06-22 04:16:24.634710'}> (pymilvus.decorators:78)
[2022-06-22 04:16:24,635] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.671060', 'gRPC error': '2022-06-22 04:16:24.635050'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,635] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:06:24.671035', 'RPC error': '2022-06-22 04:16:24.635438'}> (pymilvus.decorators:78)
[2022-06-22 04:16:24,636] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.630251', 'gRPC error': '2022-06-22 04:16:24.636157'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,636] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.581308', 'gRPC error': '2022-06-22 04:16:24.636444'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,636] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.630703', 'gRPC error': '2022-06-22 04:16:24.636960'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,637] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.581463', 'gRPC error': '2022-06-22 04:16:24.637234'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,637] [   ERROR] - grpc RpcError: [_execute_search_requests], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.631333', 'gRPC error': '2022-06-22 04:16:24.637705'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,638] [   ERROR] - grpc RpcError: [search], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.581738', 'gRPC error': '2022-06-22 04:16:24.638096'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,638] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.581565', 'gRPC error': '2022-06-22 04:16:24.638650'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,639] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.581632', 'gRPC error': '2022-06-22 04:16:24.639325'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,639] [   ERROR] - grpc RpcError: [query], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:06:24.581825', 'gRPC error': '2022-06-22 04:16:24.639850'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,642] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.641336', 'gRPC error': '2022-06-22 04:16:24.642108'}> (pymilvus.decorators:86)
[2022-06-22 04:16:24,642] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:11:24.641263', 'RPC error': '2022-06-22 04:16:24.642788'}> (pymilvus.decorators:78)
[2022-06-22 04:21:24,638] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.641473', 'gRPC error': '2022-06-22 04:21:24.638096'}> (pymilvus.decorators:86)
[2022-06-22 04:21:24,694] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.641620', 'gRPC error': '2022-06-22 04:21:24.694670'}> (pymilvus.decorators:86)
[2022-06-22 04:21:24,695] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:11:24.641592', 'RPC error': '2022-06-22 04:21:24.695871'}> (pymilvus.decorators:78)
[2022-06-22 04:21:24,697] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2022-06-22 04:11:24.642321', 'gRPC error': '2022-06-22 04:21:24.697444'}> (pymilvus.decorators:86)
[2022-06-22 04:21:24,702] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc timeout)>, <Time:{'RPC start': '2022-06-22 04:11:24.642288', 'RPC error': '2022-06-22 04:21:24.702184'}> (pymilvus.decorators:78)
[2022-06-22 04:21:24,703] [   ERROR] - grpc RpcError: [describe_collection], <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Dea

jingkl avatar Jun 22 '22 05:06 jingkl

The problem still occurs @xige-16

jingkl avatar Jun 22 '22 05:06 jingkl

image

longjiquan avatar Jun 22 '22 06:06 longjiquan

[2022/06/22 04:06:24.679 +00:00] [DEBUG] [impl.go:468] ["start do search"] [msgID=434076581765411043] [fromShardLeader=false] [vChannel=by-dev-rootcoord-dml_1_434076502637350209v1] [segmentIDs="[]"]
[2022/06/22 04:06:24.679 +00:00] [DEBUG] [impl.go:468] ["start do search"] [msgID=434076581765411043] [fromShardLeader=true] [vChannel=by-dev-rootcoord-dml_1_434076502637350209v1] [segmentIDs="[434076503423582210,434076506149879809]"]
[2022/06/22 04:06:24.679 +00:00] [DEBUG] [task_read.go:173] ["query msg can do"] [collectionID=434076502637350209] [sm.GuaranteeTimestamp=2022/06/22 04:06:19.631 +00:00] [serviceTime=2022/06/22 04:06:24.456 +00:00] ["delta milliseconds"=-4825] [channel=by-dev-rootcoord-delta_1_434076502637350209v1] [msgID=434076581765411043]

[2022/06/22 04:06:24.679 +00:00] [DEBUG] [task_read.go:173] ["query msg can do"] [collectionID=434076502637350209] [sm.GuaranteeTimestamp=2022/06/22 04:06:19.631 +00:00] [serviceTime=2022/06/22 04:06:24.456 +00:00] ["delta milliseconds"=-4825] [channel=by-dev-rootcoord-dml_1_434076502637350209v1] [msgID=434076581765411043]
[2022/06/22 04:24:32.000 +00:00] [DEBUG] [time_recorder.go:78] ["do search done, msgID = 434076581765411043, fromSharedLeader = true, vChannel = by-dev-rootcoord-dml_1_434076502637350209v1, segmentIDs = [434076503423582210 434076506149879809] (1087320ms)"]

[2022/06/22 04:24:32.000 +00:00] [DEBUG] [time_recorder.go:78] ["do search done, msgID = 434076581765411043, fromSharedLeader = false, vChannel = by-dev-rootcoord-dml_1_434076502637350209v1, segmentIDs = [] (1087321ms)"]
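
The duration printed by `time_recorder.go` can be cross-checked against the timestamps of the "start do search" and "do search done" lines for the same msgID. This is a minimal sketch in Python; the helper name `elapsed_ms` is made up for illustration, and the two timestamps are copied from the log lines above.

```python
from datetime import datetime

# Timestamp layout used by the Milvus server log lines quoted above
# (the trailing "+00:00" offset is left out of the sample strings).
LOG_FORMAT = "%Y/%m/%d %H:%M:%S.%f"


def elapsed_ms(start: str, end: str) -> int:
    """Return the whole milliseconds between two server-log timestamps."""
    delta = datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)
    return round(delta.total_seconds() * 1000)


start = "2022/06/22 04:06:24.679"  # "start do search"
end = "2022/06/22 04:24:32.000"    # "do search done"
print(elapsed_ms(start, end))
```

The result is about 18 minutes, in line with the `1087320ms` / `1087321ms` values the server reports, so the time is spent between those two log points on the query node rather than on the client side.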

xige-16 avatar Jun 22 '22 07:06 xige-16

No conclusion yet

xige-16 avatar Jun 24 '22 11:06 xige-16

argo task: test-etcd-no-clean-pmx9g

test yaml: client-configmap: client-random-locust-insert-delete-60h server-configmap: server-single-16c64m-compaction

image: 2.1.0-20220624-6c47ea2f

server:

NAME                                                            READY   STATUS      RESTARTS   AGE   IP             NODE                      NOMINATED NODE   READINESS GATES
test-etcd-no-clean-pmx9g-1-0                                    1/1     Running     0          20h   10.97.17.53    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-pmx9g-1-milvus-standalone-6c8787566-kwzs5    1/1     Running     4          20h   10.97.16.241   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-pmx9g-1-minio-b7d9685d5-dqz2g                1/1     Running     0          20h   10.97.16.240   qa-node013.zilliz.local   <none>           <none>

client pod: test-etcd-no-clean-pmx9g-165860547

client log file: locust_report_2022-06-24_998.log

client log:

attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-06-25 08:44:42.877509', 'RPC error': '2022-06-25 08:45:02.936532'}> (pymilvus.decorators:94)
[2022-06-25 08:45:02,937] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-06-25 08:44:32.741036', 'RPC error': '2022-06-25 08:45:02.937014'}> (pymilvus.decorators:94)
[2022-06-25 08:45:12,945] [   DEBUG] - Milvus delete run in 20.048s (milvus_benchmark.client:54)
[2022-06-25 08:45:12,955] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-06-25 08:44:52.897380', 'RPC error': '2022-06-25 08:45:12.955248'}> (pymilvus.decorators:94)
[2022-06-25 08:45:12,970] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-06-25 08:44:42.877801', 'RPC error': '2022-06-25 08:45:12.970591'}> (pymilvus.decorators:94)
[2022-06-25 08:45:12,975] [    INFO] -  Name                                                                              # reqs      # fails  |     Avg     Min     Max  Median  |   req/s failures/s (locust.stats_logger:725)
[2022-06-25 08:45:12,978] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:727)
[2022-06-25 08:45:13,024] [    INFO] -  grpc delete                                                                        38941     8(0.02%)  |    1209       1  600868     940  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-25 08:45:13,036] [    INFO] -  grpc get                                                                          194292   552(0.28%)  |    1976       3 1561445    1500  |    0.10    0.10 (locust.stats_logger:730)
[2022-06-25 08:45:13,040] [    INFO] -  grpc insert                                                                        38829     0(0.00%)  |    1798      21  900751    1500  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-25 08:45:13,042] [    INFO] -  grpc load_collection                                                               77421    15(0.02%)  |    1798       3  610725    1500  |    0.00    0.00 (locust.stats_logger:730)
[2022-06-25 08:45:13,043] [    INFO] -  grpc query                                                                        389330  1047(0.27%)  |    1959       7  610726    1700  |    0.40    0.40 (locust.stats_logger:730)
[2022-06-25 08:45:13,045] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:731)
[2022-06-25 08:45:13,062] [    INFO] -  Aggregated                                                                        738813  1622(0.22%)  |    1898       1 1561445    1600  |    0.50    0.50 (locust.stats_logger:732)
[2022-06-25 08:45:13,065] [    INFO] -  (locust.stats_logger:733)
[2022-06-25 08:45:13,066] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-06-25 08:44:42.877921', 'RPC error': '2022-06-25 08:45:13.066840'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,956] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146772.945379761","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:44:52.897862', 'RPC error': '2022-06-25 08:50:12.956854'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,957] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146772.945733318","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:44:52.898032', 'RPC error': '2022-06-25 08:50:12.957699'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,958] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146772.945769942","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:44:52.898180', 'RPC error': '2022-06-25 08:50:12.958173'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,958] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146772.954716250","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:44:52.898468', 'RPC error': '2022-06-25 08:50:12.958704'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,959] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146772.954746074","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:44:52.898610', 'RPC error': '2022-06-25 08:50:12.959677'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,960] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146772.955699927","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:44:52.898847', 'RPC error': '2022-06-25 08:50:12.960207'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,960] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656146773.074730004","description":"Error received from peer ipv4:10.96.59.21:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-25 08:45:02.937929', 'RPC error': '2022-06-25 08:50:12.960616'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,961] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-25 08:44:52.897543', 'RPC error': '2022-06-25 08:50:12.961136'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,961] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-25 08:44:52.898281', 'RPC error': '2022-06-25 08:50:12.961832'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,962] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-25 08:44:52.898353', 'RPC error': '2022-06-25 08:50:12.962247'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,962] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-25 08:44:52.898959', 'RPC error': '2022-06-25 08:50:12.962632'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,963] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-25 08:44:52.898723', 'RPC error': '2022-06-25 08:50:12.963047'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,965] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-25 08:45:13.072999', 'RPC error': '2022-06-25 08:50:12.965333'}> (pymilvus.decorators:94)
[2022-06-25 08:50:12,965] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-25 08:45:13.072940', 'RPC error': '2022-06-25 08:50:12.965647'}> (pymilvus.decorators:94)
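
Several of the entries above carry `Retry timeout: 60s` while their 'RPC start' / 'RPC error' pair spans roughly five minutes. A rough way to pull that out of a client log line is sketched below in Python. The function name `overrun` and the regular expressions are illustrative assumptions, not part of pymilvus; the sample line is copied from the log above.

```python
import re
from datetime import datetime

# Patterns for the timing fields that pymilvus.decorators prints in the log above.
TIME_RE = re.compile(r"'RPC start': '([^']+)', 'RPC error': '([^']+)'")
TIMEOUT_RE = re.compile(r"Retry timeout: (\d+)s")
FMT = "%Y-%m-%d %H:%M:%S.%f"


def overrun(line: str):
    """Return (configured_timeout_s, actual_elapsed_s), or None if fields are missing."""
    times = TIME_RE.search(line)
    timeout = TIMEOUT_RE.search(line)
    if not (times and timeout):
        return None
    start, end = (datetime.strptime(t, FMT) for t in times.groups())
    return int(timeout.group(1)), (end - start).total_seconds()


sample = (
    "[2022-06-25 08:50:12,965] [   ERROR] - RPC error: [describe_collection], "
    "<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, "
    "<Time:{'RPC start': '2022-06-25 08:45:13.072999', "
    "'RPC error': '2022-06-25 08:50:12.965333'}> (pymilvus.decorators:94)"
)
configured, actual = overrun(sample)
print(f"configured={configured}s actual={actual:.1f}s")
```

For this line the call ran close to 300s against a stated 60s retry timeout. Why the retry budget is overshot is not established in this thread; the sketch only makes the gap easy to spot across a whole log file.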

client-random-locust-insert-delete-60h:

    locust_random_performance:
      collections:
        -
          collection_name: sift_10w_128_l2
          ni_per: 50000
          other_fields: float1
          build_index: true
          index_type: ivf_sq8
          index_param:
            nlist: 2048
          task:
            types:
              -
                type: query
                weight: 10
                params:
                  top_k: 10
                  nq: 10
                  search_param:
                    nprobe: 16
                  filters:
                    -
                      range: "{'range': {'float1': {'GT': collection_size * 0.5, 'LT': collection_size * 1}}}"
              -
                type: load
                weight: 2
              -
                type: get
                weight: 5
                params:
                  ids_length: 10
              -
                type: insert
                weight: 1
                params:
                  ni_per: 1000
              -
                type: delete
                weight: 1
                params:
                  ni_per: 100
            connection_num: 1
            clients_num: 20
            spawn_rate: 2
            during_time: 60h
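
As a sanity check on the config above, the task weights can be compared with the request counts in the locust stats table from the same client log. This Python sketch assumes the benchmark picks tasks in proportion to `weight` (the usual Locust behaviour) and that the `load` task corresponds to the `grpc load_collection` row; both are assumptions, not something confirmed in this thread.

```python
# Task weights from the client-random-locust-insert-delete-60h config.
weights = {"query": 10, "load": 2, "get": 5, "insert": 1, "delete": 1}

# Request counts from the locust stats table in the client log above
# ("load" is assumed to map to the "grpc load_collection" row).
observed = {"query": 389330, "load": 77421, "get": 194292, "insert": 38829, "delete": 38941}

total_weight = sum(weights.values())
total_reqs = sum(observed.values())

for name in weights:
    expected = weights[name] / total_weight
    actual = observed[name] / total_reqs
    print(f"{name:<7} expected={expected:6.2%} observed={actual:6.2%}")
```

The observed shares stay within a fraction of a percent of the weighted shares, so the request mix matches the config and the failures are not caused by a skewed workload.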

wangting0128 avatar Jun 25 '22 08:06 wangting0128

argo task: test-etcd-no-clean-vkrrs

test yaml: client-configmap: client-random-locust-insert-delete-18h server-configmap: server-single-16c64m-compaction

image: 2.1.0-20220625-4e2b2bfa

pymilvus: 2.1.0dev87

test-etcd-no-clean-vkrrs-1-0                                    1/1     Running     0          40h     10.97.16.100   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-vkrrs-1-milvus-standalone-5c5c65845c-hb6ld   1/1     Running     4          40h     10.97.20.135   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-vkrrs-1-minio-554cf5fd99-5cvk5               1/1     Running     0          40h     10.97.16.99    qa-node013.zilliz.local   <none>           <none>
[2022-06-26 02:51:17,087] [    INFO] -  (locust.stats_logger:733)
[2022-06-26 02:51:17,088] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:07.063754', 'RPC error': '2022-06-26 02:51:17.088165'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,088] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:45:57.040353', 'RPC error': '2022-06-26 02:51:17.088701'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,088] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211637.073849372","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:45:57.040665', 'RPC error': '2022-06-26 02:51:17.088963'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,089] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211637.073698591","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:45:57.040516', 'RPC error': '2022-06-26 02:51:17.089233'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,091] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076074', 'RPC error': '2022-06-26 02:51:17.091265'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,091] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076022', 'RPC error': '2022-06-26 02:51:17.091401'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076233', 'RPC error': '2022-06-26 02:52:32.109161'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076207', 'RPC error': '2022-06-26 02:52:32.109436'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076367', 'RPC error': '2022-06-26 02:52:32.109818'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076343', 'RPC error': '2022-06-26 02:52:32.109933'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076575', 'RPC error': '2022-06-26 02:52:32.110121'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076559', 'RPC error': '2022-06-26 02:52:32.110243'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)> (milvus_benchmark.client:169)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076705', 'RPC error': '2022-06-26 02:52:32.110519'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076679', 'RPC error': '2022-06-26 02:52:32.110606'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076827', 'RPC error': '2022-06-26 02:52:32.110766'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076803', 'RPC error': '2022-06-26 02:52:32.110848'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [ WARNING] - Retry [describe_collection] No.1 in 0.2s, retry reason: <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded> (pymilvus.decorators:68)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.077019', 'RPC error': '2022-06-26 02:52:32.111173'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076994', 'RPC error': '2022-06-26 02:52:32.111284'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [   DEBUG] - Milvus load_collection run in 385.0473s (milvus_benchmark.client:54)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.083253920","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.064151', 'RPC error': '2022-06-26 02:52:32.111746'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.084176278","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.064363', 'RPC error': '2022-06-26 02:52:32.111946'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.085164011","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.064551', 'RPC error': '2022-06-26 02:52:32.112121'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.086162995","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.063986', 'RPC error': '2022-06-26 02:52:32.112380'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [query], <MilvusUnavaliableException: (code=1, message=server unavaliable: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:07.064452', 'RPC error': '2022-06-26 02:52:32.112603'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [query], <MilvusUnavaliableException: (code=1, message=server unavaliable: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:07.064727', 'RPC error': '2022-06-26 02:52:32.112835'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,115] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089783', 'RPC error': '2022-06-26 02:52:32.115314'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,115] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089726', 'RPC error': '2022-06-26 02:52:32.115456'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,910] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089996', 'RPC error': '2022-06-26 02:54:18.910941'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,911] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089968', 'RPC error': '2022-06-26 02:54:18.911190'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,911] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.091020', 'RPC error': '2022-06-26 02:54:18.911784'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,911] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.090985', 'RPC error': '2022-06-26 02:54:18.911897'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.091157', 'RPC error': '2022-06-26 02:54:18.912065'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.091132', 'RPC error': '2022-06-26 02:54:18.912151'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [   ERROR] - RPC error: [load_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2022-06-26 02:52:32.108866', 'RPC error': '2022-06-26 02:54:18.912319'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [    INFO] -  Name         

jingkl avatar Jun 27 '22 02:06 jingkl

argo task: test-etcd-no-clean-wggls

test yaml: client-configmap: client-random-locust-insert-delete-60h server-configmap: server-cluster-dn2c10m-in8c32m-qn8c64m-compaction

image: 2.1.0-20220624-6c47ea2f

pymilvus: 2.1.0dev87

test-etcd-no-clean-wggls-1-0                                    1/1     Running     0          2d13h   10.97.16.226   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-1                                    1/1     Running     0          2d13h   10.97.17.51    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-2                                    1/1     Running     0          2d13h   10.97.16.231   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-datacoord-849b96865-5qhdc     1/1     Running     0          2d13h   10.97.20.28    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-datanode-6cbc5c47d8-h9cgl     1/1     Running     0          2d13h   10.97.20.31    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-indexcoord-7db95b96b6-tkk5k   1/1     Running     0          2d13h   10.97.20.29    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-indexnode-85ff6847-cj8zr      1/1     Running     0          2d13h   10.97.20.37    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-proxy-6db5455b-qh5zd          1/1     Running     0          2d13h   10.97.20.32    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-querycoord-68f87c54bf-2xtb2   1/1     Running     0          2d13h   10.97.20.35    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-querynode-6c9cb85685-hhmwc    1/1     Running     0          2d13h   10.97.20.38    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-milvus-rootcoord-58794b6d69-bd4cb    1/1     Running     0          2d13h   10.97.20.30    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-minio-0                              1/1     Running     0          2d13h   10.97.20.43    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-minio-1                              1/1     Running     0          2d13h   10.97.16.218   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-minio-2                              1/1     Running     0          2d13h   10.97.20.44    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-minio-3                              1/1     Running     0          2d13h   10.97.20.45    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-bookie-0                      1/1     Running     0          2d13h   10.97.16.228   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-bookie-1                      1/1     Running     0          2d13h   10.97.16.220   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-bookie-2                      1/1     Running     0          2d13h   10.97.16.230   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-bookie-init-tld7k             0/1     Completed   0          2d13h   10.97.20.34    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-broker-0                      1/1     Running     0          2d13h   10.97.20.36    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-proxy-0                       1/1     Running     0          2d13h   10.97.20.39    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-pulsar-init-hxdlj             0/1     Completed   0          2d13h   10.97.20.33    qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-recovery-0                    1/1     Running     0          2d13h   10.97.16.203   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-zookeeper-0                   1/1     Running     0          2d13h   10.97.16.215   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-zookeeper-1                   1/1     Running     0          2d13h   10.97.16.233   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-wggls-1-pulsar-zookeeper-2                   1/1     Running     0          2d13h   10.97.20.48    qa-node018.zilliz.local   <none>           <none>
[2022-06-26 07:08:56,900] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 07:03:56.877886', 'RPC error': '2022-06-26 07:08:56.900408'}> (pymilvus.decorators:94)
[2022-06-26 07:08:56,901] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 06:47:56.812773', 'RPC error': '2022-06-26 07:08:56.901111'}> (pymilvus.decorators:94)
[2022-06-26 07:08:56,902] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:03:56.888285', 'RPC error': '2022-06-26 07:08:56.902565'}> (pymilvus.decorators:94)
[2022-06-26 07:08:56,904] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:03:56.888230', 'RPC error': '2022-06-26 07:08:56.904700'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,894] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:03:56.888508', 'RPC error': '2022-06-26 07:09:56.894811'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,895] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:03:56.888467', 'RPC error': '2022-06-26 07:09:56.895521'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,896] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:03:56.888717', 'RPC error': '2022-06-26 07:09:56.896142'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,896] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:03:56.888666', 'RPC error': '2022-06-26 07:09:56.896646'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,897] [   DEBUG] - Milvus load_collection run in 360.0187s (milvus_benchmark.client:54)
[2022-06-26 07:09:56,897] [   DEBUG] - Milvus load_collection run in 360.0185s (milvus_benchmark.client:54)
[2022-06-26 07:09:56,899] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.890929149","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.878234', 'RPC error': '2022-06-26 07:09:56.899038'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,899] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.891887736","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.878610', 'RPC error': '2022-06-26 07:09:56.899535'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,900] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.891915221","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.879221', 'RPC error': '2022-06-26 07:09:56.900125'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,900] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.892885121","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.879033', 'RPC error': '2022-06-26 07:09:56.900632'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,901] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.892904077","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.879491', 'RPC error': '2022-06-26 07:09:56.901198'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,901] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.892916989","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.878839', 'RPC error': '2022-06-26 07:09:56.901816'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,902] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.892949317","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.879687', 'RPC error': '2022-06-26 07:09:56.902477'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,906] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656227396.893873898","description":"Error received from peer ipv4:10.96.53.65:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 07:03:56.883874', 'RPC error': '2022-06-26 07:09:56.906748'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,910] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:08:56.901961', 'RPC error': '2022-06-26 07:09:56.909996'}> (pymilvus.decorators:94)
[2022-06-26 07:09:56,910] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:08:56.901904', 'RPC error': '2022-06-26 07:09:56.910443'}> (pymilvus.decorators:94)
[2022-06-26 07:14:56,901] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:08:56.902170', 'RPC error': '2022-06-26 07:14:56.900911'}> (pymilvus.decorators:94)
[2022-06-26 07:14:56,901] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 07:08:56.902132', 'RPC error': '2022-06-26 07:14:56.901697'}> (pymilvus.decorators:94)
[2022-06-26 07:14:56,903] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 06:53:56.832131', 'RPC error': '2022-06-26 07:14:56.903029'}> (pymilvus.decorators:94)
[2022-06-26 07:14:56,903] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 06:53:56.832366', 'RPC error': '2022-06-26 07:14:56.903938'}> (pymilvus.decorators:94)

jingkl avatar Jun 27 '22 02:06 jingkl

test-etcd-no-clean-pmx9g-1-milvus-standalone-6c8787566-kwzs5

Milvus restarted several times due to insufficient memory.

It seems that queries did not recover after the restart, which is a different problem from the occasional search timeouts in this issue. @wangting0128
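
One way to tell a long stall from an occasional slow request is to compare each request's wall-clock duration with the retry timeout it reports. Below is a minimal sketch that does this for single-line pymilvus error entries like the ones pasted in this thread. The helper name `rpc_elapsed` and the regexes are my own assumptions, not part of pymilvus or the benchmark tooling, and multi-line `_MultiThreadedRendezvous` entries would need to be joined into one string first.

```python
import re
from datetime import datetime

# Regexes written against the log format pasted in this thread (assumed stable).
OP_RE = re.compile(r"RPC error: \[(\w+)\]")
TIMEOUT_RE = re.compile(r"Retry timeout: (\d+)s")
START_RE = re.compile(r"'RPC start': '([^']+)'")
END_RE = re.compile(r"'RPC error': '([^']+)'")
TIME_FMT = "%Y-%m-%d %H:%M:%S.%f"


def rpc_elapsed(line):
    """Hypothetical helper: return op, elapsed seconds and retry timeout for one
    log entry, or None if the entry is not a complete 'RPC error' record."""
    op = OP_RE.search(line)
    start = START_RE.search(line)
    end = END_RE.search(line)
    if not (op and start and end):
        return None
    timeout = TIMEOUT_RE.search(line)
    elapsed = (
        datetime.strptime(end.group(1), TIME_FMT)
        - datetime.strptime(start.group(1), TIME_FMT)
    ).total_seconds()
    return {
        "op": op.group(1),
        "elapsed_s": elapsed,
        "retry_timeout_s": int(timeout.group(1)) if timeout else None,
    }


# One of the search errors quoted earlier in this thread.
sample = (
    "[2022-06-26 07:08:56,904] [   ERROR] - RPC error: [search], "
    "<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, "
    "<Time:{'RPC start': '2022-06-26 07:03:56.888230', "
    "'RPC error': '2022-06-26 07:08:56.904700'}> (pymilvus.decorators:94)"
)
result = rpc_elapsed(sample)
print(result)
```

For this entry the request was outstanding for roughly 300 s although the reported retry timeout is 60s, which looks more like requests being blocked for a long time than an occasional slow search.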

xige-16 avatar Jun 27 '22 02:06 xige-16

argo test-etcd-no-clean-vkrrs server-configmap server-single-16c64m-compaction client-configmap client-random-locust-insert-delete-18h

2.1.0-20220625-4e2b2bfa

pymilvus 2.1.0dev87

test-etcd-no-clean-vkrrs-1-0                                    1/1     Running     0          40h     10.97.16.100   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-vkrrs-1-milvus-standalone-5c5c65845c-hb6ld   1/1     Running     4          40h     10.97.20.135   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-vkrrs-1-minio-554cf5fd99-5cvk5               1/1     Running     0          40h     10.97.16.99    qa-node013.zilliz.local   <none>           <none>
[2022-06-26 02:51:17,087] [    INFO] -  (locust.stats_logger:733)
[2022-06-26 02:51:17,088] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:07.063754', 'RPC error': '2022-06-26 02:51:17.088165'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,088] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:45:57.040353', 'RPC error': '2022-06-26 02:51:17.088701'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,088] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211637.073849372","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:45:57.040665', 'RPC error': '2022-06-26 02:51:17.088963'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,089] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211637.073698591","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:45:57.040516', 'RPC error': '2022-06-26 02:51:17.089233'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,091] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076074', 'RPC error': '2022-06-26 02:51:17.091265'}> (pymilvus.decorators:94)
[2022-06-26 02:51:17,091] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076022', 'RPC error': '2022-06-26 02:51:17.091401'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076233', 'RPC error': '2022-06-26 02:52:32.109161'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076207', 'RPC error': '2022-06-26 02:52:32.109436'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076367', 'RPC error': '2022-06-26 02:52:32.109818'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,109] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076343', 'RPC error': '2022-06-26 02:52:32.109933'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076575', 'RPC error': '2022-06-26 02:52:32.110121'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076559', 'RPC error': '2022-06-26 02:52:32.110243'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)> (milvus_benchmark.client:169)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076705', 'RPC error': '2022-06-26 02:52:32.110519'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076679', 'RPC error': '2022-06-26 02:52:32.110606'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076827', 'RPC error': '2022-06-26 02:52:32.110766'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,110] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076803', 'RPC error': '2022-06-26 02:52:32.110848'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [ WARNING] - Retry [describe_collection] No.1 in 0.2s, retry reason: <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded> (pymilvus.decorators:68)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.077019', 'RPC error': '2022-06-26 02:52:32.111173'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:46:17.076994', 'RPC error': '2022-06-26 02:52:32.111284'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [   DEBUG] - Milvus load_collection run in 385.0473s (milvus_benchmark.client:54)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.083253920","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.064151', 'RPC error': '2022-06-26 02:52:32.111746'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,111] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.084176278","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.064363', 'RPC error': '2022-06-26 02:52:32.111946'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.085164011","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.064551', 'RPC error': '2022-06-26 02:52:32.112121'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656211937.086162995","description":"Error received from peer ipv4:10.96.202.253:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-26 02:46:07.063986', 'RPC error': '2022-06-26 02:52:32.112380'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [query], <MilvusUnavaliableException: (code=1, message=server unavaliable: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:07.064452', 'RPC error': '2022-06-26 02:52:32.112603'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,112] [   ERROR] - RPC error: [query], <MilvusUnavaliableException: (code=1, message=server unavaliable: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-26 02:46:07.064727', 'RPC error': '2022-06-26 02:52:32.112835'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,115] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089783', 'RPC error': '2022-06-26 02:52:32.115314'}> (pymilvus.decorators:94)
[2022-06-26 02:52:32,115] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089726', 'RPC error': '2022-06-26 02:52:32.115456'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,910] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089996', 'RPC error': '2022-06-26 02:54:18.910941'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,911] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.089968', 'RPC error': '2022-06-26 02:54:18.911190'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,911] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.091020', 'RPC error': '2022-06-26 02:54:18.911784'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,911] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.090985', 'RPC error': '2022-06-26 02:54:18.911897'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.091157', 'RPC error': '2022-06-26 02:54:18.912065'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-26 02:51:17.091132', 'RPC error': '2022-06-26 02:54:18.912151'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [   ERROR] - RPC error: [load_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2022-06-26 02:52:32.108866', 'RPC error': '2022-06-26 02:54:18.912319'}> (pymilvus.decorators:94)
[2022-06-26 02:54:18,912] [    INFO] -  Name         

The growing segment was not cleaned up during handoff, which led to OOM, and some searches time out after the restart. The bug has been fixed.

xige-16 avatar Jun 28 '22 02:06 xige-16

/assign @wangting0128

xiaofan-luan avatar Jun 28 '22 03:06 xiaofan-luan

argo test-etcd-no-clean-wggls server-configmap server-cluster-dn2c10m-in8c32m-qn8c64m-compaction client-configmap client-random-locust-insert-delete-60h 2.1.0-20220624-6c47ea2f pymilvus 2.1.0dev87

querynode's tsafe stopped advancing around 06-26 04:07

xige-16 avatar Jun 28 '22 04:06 xige-16

rootcoord stopped sending timetick due to backlog quota exceeded

time="2022-06-26T04:07:49Z" level=info msg="Broker notification of Closed producer: 2" local_addr="10.97.20.30:59148" remote_addr="pulsar://test-etcd-no-clean-wggls-1-pulsar-proxy:6650"

time="2022-06-26T04:07:49Z" level=warning msg="[Connection was closed]" cnx="10.97.20.30:59148 -> 10.96.88.79:6650" producerID=2 producer_name=test-etcd-no-clean-wggls-1-pulsar-0-1 topic="persistent://public/default/by-dev-rootcoord-dml_1"

time="2022-06-26T04:07:49Z" level=info msg="[Reconnecting to broker in  106.244432ms]" producerID=2 producer_name=test-etcd-no-clean-wggls-1-pulsar-0-1 topic="persistent://public/default/by-dev-rootcoord-dml_1"

time="2022-06-26T04:07:56Z" level=error msg="[Failed to create producer]" error="server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded" producerID=2 producer_name=test-etcd-no-clean-wggls-1-pulsar-0-1 topic="persistent://public/default/by-dev-rootcoord-dml_1"
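For anyone triaging similar logs: the Pulsar client lines above are in key=value form, so the failing topic and error can be pulled out mechanically. A minimal sketch, assuming every field is `key=value` or `key="quoted value"` as in the lines above; `parse_logfmt` is a hypothetical helper, not part of Milvus or the Pulsar client.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict; quoted values may contain spaces."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore stray tokens that are not key=value
            fields[key] = value
    return fields


# Line copied from the rootcoord log above.
line = (
    'time="2022-06-26T04:07:56Z" level=error msg="[Failed to create producer]" '
    'error="server error: ProducerBlockedQuotaExceededException: Cannot create '
    'producer on topic with backlog quota exceeded" producerID=2 '
    'producer_name=test-etcd-no-clean-wggls-1-pulsar-0-1 '
    'topic="persistent://public/default/by-dev-rootcoord-dml_1"'
)

fields = parse_logfmt(line)
if fields["level"] == "error":
    print(fields["topic"], "->", fields["error"])
```

Filtering on `level` and grouping by `topic` should make it easier to see which channel hit the backlog quota first.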

xige-16 avatar Jun 28 '22 05:06 xige-16

server-instance: test-etcd-no-clean-6w7xr-1, server-configmap: server-cluster-dn2c10m-in8c32m-qn8c64m-compaction, client-configmap: client-random-locust-insert-delete-60h

sunby-debug_query_hang-7339f3f94-20220628 pymilvus 2.1.0dev87

client log

[2022-06-29 02:44:11,884] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:39:11.600772', 'RPC error': '2022-06-29 02:44:11.884548'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,886] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:39:11.600682', 'RPC error': '2022-06-29 02:44:11.886103'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,887] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:39:11.601605', 'RPC error': '2022-06-29 02:44:11.887868'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,889] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:39:11.601542', 'RPC error': '2022-06-29 02:44:11.889365'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,891] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:39:11.601925', 'RPC error': '2022-06-29 02:44:11.891258'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,892] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:39:11.601881', 'RPC error': '2022-06-29 02:44:11.892678'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,894] [   ERROR] - RPC error: [load_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2022-06-29 02:39:11.602139', 'RPC error': '2022-06-29 02:44:11.894404'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,895] [   ERROR] - RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-29 02:39:11.604104', 'RPC error': '2022-06-29 02:44:11.895064'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,895] [   ERROR] - RPC error: [load_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 30s)>, <Time:{'RPC start': '2022-06-29 02:39:11.604306', 'RPC error': '2022-06-29 02:44:11.895689'}> (pymilvus.decorators:94)
[2022-06-29 02:44:11,896] [ WARNING] - Retry [describe_collection] No.1 in 0.2s, retry reason: <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded> (pymilvus.decorators:68)
[2022-06-29 02:44:11,897] [   ERROR] - RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-29 02:39:11.604559', 'RPC error': '2022-06-29 02:44:11.897631'}> (pymilvus.decorators:94)
[2022-06-29 02:49:11,902] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656470711.903140248","description":"Error received from peer ipv4:10.96.209.233:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-29 02:44:11.841818', 'RPC error': '2022-06-29 02:49:11.902620'}> (pymilvus.decorators:94)
[2022-06-29 02:49:11,904] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-29 02:44:11.580099', 'RPC error': '2022-06-29 02:49:11.904433'}> (pymilvus.decorators:94)
[2022-06-29 02:49:11,906] [ WARNING] - Retry [describe_collection] No.1 in 0.2s, retry reason: <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded> (pymilvus.decorators:68)
[2022-06-29 02:50:11,903] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.899599', 'RPC error': '2022-06-29 02:50:11.903861'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,926] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.899541', 'RPC error': '2022-06-29 02:50:11.926227'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,930] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-29 02:44:11.899877', 'RPC error': '2022-06-29 02:50:11.930566'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,934] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-06-29 02:44:11.899851', 'RPC error': '2022-06-29 02:50:11.934161'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,938] [   ERROR] - <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)> (milvus_benchmark.client:169)
[2022-06-29 02:50:11,942] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.900115', 'RPC error': '2022-06-29 02:50:11.942400'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,943] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.900062', 'RPC error': '2022-06-29 02:50:11.943170'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,943] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.900344', 'RPC error': '2022-06-29 02:50:11.943944'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,944] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.900306', 'RPC error': '2022-06-29 02:50:11.944630'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,945] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.900912', 'RPC error': '2022-06-29 02:50:11.945488'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,946] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.900507', 'RPC error': '2022-06-29 02:50:11.946195'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,947] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901166', 'RPC error': '2022-06-29 02:50:11.947060'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,947] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901127', 'RPC error': '2022-06-29 02:50:11.947656'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,948] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901395', 'RPC error': '2022-06-29 02:50:11.948298'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,948] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901340', 'RPC error': '2022-06-29 02:50:11.948827'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,949] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901626', 'RPC error': '2022-06-29 02:50:11.949697'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,957] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901565', 'RPC error': '2022-06-29 02:50:11.956953'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,957] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901819', 'RPC error': '2022-06-29 02:50:11.957752'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,958] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901781', 'RPC error': '2022-06-29 02:50:11.958389'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,959] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.902082', 'RPC error': '2022-06-29 02:50:11.959388'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,960] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-06-29 02:44:11.901998', 'RPC error': '2022-06-29 02:50:11.960483'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,961] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656471011.903112780","description":"Error received from peer ipv4:10.96.209.233:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-29 02:44:11.842256', 'RPC error': '2022-06-29 02:50:11.961600'}> (pymilvus.decorators:94)
[2022-06-29 02:50:11,962] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656471011.903164594","description":"Error received from peer ipv4:10.96.209.233:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-06-29 02:44:11.842545', 'RPC error': '2022-06-29 02:50:11.962794'}> (pymilvus.decorators:94)
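The `Time:{...}` field on each error line can be compared against the timeout named in the message. A small sketch, assuming the pymilvus log format shown above; the function name is hypothetical and the sample line is copied from this log.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

TIME_RE = re.compile(r"'RPC start': '([^']+)', '(?:g)?RPC error': '([^']+)'")
TIMEOUT_RE = re.compile(r"Retry timeout: (\d+)s")
FMT = "%Y-%m-%d %H:%M:%S.%f"


def rpc_elapsed_and_timeout(line: str) -> Tuple[float, Optional[int]]:
    """Return (seconds between RPC start and RPC error, reported retry timeout or None)."""
    times = TIME_RE.search(line)
    if times is None:
        raise ValueError("no RPC start/error timestamps in line")
    start, end = (datetime.strptime(t, FMT) for t in times.groups())
    timeout = TIMEOUT_RE.search(line)
    return (end - start).total_seconds(), int(timeout.group(1)) if timeout else None


line = (
    "[2022-06-29 02:44:11,884] [   ERROR] - RPC error: [describe_collection], "
    "<MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, "
    "<Time:{'RPC start': '2022-06-29 02:39:11.600772', "
    "'RPC error': '2022-06-29 02:44:11.884548'}> (pymilvus.decorators:94)"
)

elapsed, timeout = rpc_elapsed_and_timeout(line)
print(f"{elapsed:.1f}s elapsed, reported timeout {timeout}s")  # 300.3s elapsed, reported timeout 60s
```

On this describe_collection line the call ran for about 300 s even though the message reports a 60 s retry timeout.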

server:

NAME                                                            READY   STATUS      RESTARTS   AGE     IP             NODE                      NOMINATED NODE   READINESS GATES
test-etcd-no-clean-6w7xr-1-0                                    1/1     Running     0          18h     10.97.16.146   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-1                                    1/1     Running     0          18h     10.97.17.142   qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-2                                    1/1     Running     0          18h     10.97.16.151   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-datacoord-5d9cb48769-56sqg    1/1     Running     0          18h     10.97.4.94     qa-node002.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-datanode-86ccc6bd6d-c5z4w     1/1     Running     0          18h     10.97.20.181   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-indexcoord-54769fbfc8-9hxzt   1/1     Running     0          18h     10.97.4.92     qa-node002.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-indexnode-6d796bd479-9k2kt    1/1     Running     0          18h     10.97.13.106   qa-node010.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-proxy-7d7c98c7b8-tjz6k        1/1     Running     0          18h     10.97.13.105   qa-node010.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-querycoord-797c677f4f-pqdlh   1/1     Running     0          18h     10.97.4.93     qa-node002.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-querynode-599f85f96f-sh76s    1/1     Running     11         18h     10.97.14.200   qa-node011.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-milvus-rootcoord-7699d7645c-khbfl    1/1     Running     0          18h     10.97.4.91     qa-node002.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-minio-0                              1/1     Running     0          18h     10.97.19.132   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-minio-1                              1/1     Running     0          18h     10.97.12.152   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-minio-2                              1/1     Running     0          18h     10.97.12.155   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-minio-3                              1/1     Running     0          18h     10.97.19.135   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-bookie-0                      1/1     Running     0          18h     10.97.19.133   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-bookie-1                      1/1     Running     0          18h     10.97.12.156   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-bookie-2                      1/1     Running     0          18h     10.97.16.152   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-bookie-init-8hjdw             0/1     Completed   0          18h     10.97.18.107   qa-node017.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-broker-0                      1/1     Running     0          18h     10.97.20.180   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-proxy-0                       1/1     Running     0          18h     10.97.16.144   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-pulsar-init-qjnj5             0/1     Completed   0          18h     10.97.19.128   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-recovery-0                    1/1     Running     0          18h     10.97.16.143   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-zookeeper-0                   1/1     Running     0          18h     10.97.5.170    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-zookeeper-1                   1/1     Running     0          18h     10.97.19.137   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-6w7xr-1-pulsar-zookeeper-2                   1/1     Running     0          18h     10.97.10.14    qa-node008.zilliz.local   <none>           <none>

jingkl avatar Jun 29 '22 02:06 jingkl

Two problems lead to the search timeouts:

  1. Msgstream Close() can block when its buffer is full. I have fixed this in #17909.
  2. runtime.NumCPU returns the CPU count of the host, so too many search tasks may be executed concurrently. @czs007 is fixing it.
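To illustrate the second point: Go's runtime.NumCPU reports the logical CPUs usable by the process and does not take a container's cgroup CPU quota into account, so a querynode with an 8-core limit (as the qn8c64m config name suggests) on a larger host can size its task pool far too high. The sketch below is not the Milvus fix. It is a Python illustration of deriving the bound from a cgroup v2 `cpu.max` value; the function names and the 64-core host are assumptions.

```python
import math
from typing import Optional


def cgroup_cpu_limit(cpu_max: str) -> Optional[int]:
    """Parse cgroup v2 cpu.max content ("<quota> <period>" or "max <period>").

    Returns the CPU limit rounded up to a whole core, or None when unlimited.
    """
    quota, period = cpu_max.split()
    if quota == "max":
        return None
    return max(1, math.ceil(int(quota) / int(period)))


def search_concurrency(host_cpus: int, cpu_max: str) -> int:
    """Upper bound for concurrent search tasks: the smaller of host CPUs and the quota."""
    limit = cgroup_cpu_limit(cpu_max)
    return host_cpus if limit is None else min(host_cpus, limit)


# Hypothetical numbers: a 64-core host, container limited to 8 cores.
print(search_concurrency(64, "800000 100000"))  # 8
print(search_concurrency(64, "max 100000"))     # 64 (no quota set)
```

In a container on cgroup v2 the string would typically be read from /sys/fs/cgroup/cpu.max.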

sunby avatar Jun 30 '22 06:06 sunby

/assign @czs007 /unassign @longjiquan /unassign @wangting0128

xige-16 avatar Jul 01 '22 02:07 xige-16

server-instance: test-etcd-no-clean-k646h-1, server-configmap: server-cluster-dn2c10m-in8c32m-qn8c64m-compaction, client-configmap: client-random-locust-insert-delete-18h, pymilvus: 2.1.0dev87, Milvus version: 2.1.0-20220630-a875e755

NAME                                                            READY   STATUS      RESTARTS   AGE     IP             NODE                      NOMINATED NODE   READINESS GATES
test-etcd-no-clean-k646h-1-0                                    1/1     Running     0          17h     10.97.17.21    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-1                                    1/1     Running     0          17h     10.97.16.154   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-2                                    1/1     Running     0          17h     10.97.17.23    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-datacoord-7dcb479cf6-ct74z    1/1     Running     1          17h     10.97.10.232   qa-node008.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-datanode-bf667b87c-9n4t2      1/1     Running     1          17h     10.97.14.60    qa-node011.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-indexcoord-6f9dc6495-lbghb    1/1     Running     1          17h     10.97.10.230   qa-node008.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-indexnode-66bc589cfd-vn2cq    1/1     Running     0          17h     10.97.20.160   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-proxy-bbdb76ccf-b6qvw         1/1     Running     0          17h     10.97.10.229   qa-node008.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-querycoord-669bbf696d-dr8dn   1/1     Running     1          17h     10.97.14.59    qa-node011.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-querynode-5555dfc776-ldsmm    1/1     Running     0          17h     10.97.20.159   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-milvus-rootcoord-8695448964-qmmcq    1/1     Running     1          17h     10.97.14.58    qa-node011.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-minio-0                              1/1     Running     0          17h     10.97.19.159   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-minio-1                              1/1     Running     0          17h     10.97.12.173   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-minio-2                              1/1     Running     0          17h     10.97.12.177   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-minio-3                              1/1     Running     0          17h     10.97.19.163   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-bookie-0                      1/1     Running     0          17h     10.97.12.172   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-bookie-1                      1/1     Running     0          17h     10.97.19.162   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-bookie-2                      1/1     Running     0          17h     10.97.12.178   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-bookie-init-bdr7x             0/1     Completed   0          17h     10.97.5.137    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-broker-0                      1/1     Running     0          17h     10.97.5.138    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-proxy-0                       1/1     Running     0          17h     10.97.10.231   qa-node008.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-pulsar-init-cswdh             0/1     Completed   0          17h     10.97.5.136    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-recovery-0                    1/1     Running     0          17h     10.97.18.74    qa-node017.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-zookeeper-0                   1/1     Running     0          17h     10.97.5.140    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-zookeeper-1                   1/1     Running     0          17h     10.97.10.234   qa-node008.zilliz.local   <none>           <none>
test-etcd-no-clean-k646h-1-pulsar-zookeeper-2                   1/1     Running     0          17h     10.97.10.236   qa-node008.zilliz.local   <none>           <none>
[2022-07-01 03:41:50,674] [    INFO] -  (locust.stats_logger:733)
[2022-07-01 03:41:50,677] [   ERROR] - RPC error: [delete], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-07-01 03:36:50.512867', 'RPC error': '2022-07-01 03:41:50.677901'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,685] [ WARNING] - Retry [describe_collection] No.1 in 0.2s, retry reason: <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded> (pymilvus.decorators:68)
[2022-07-01 03:41:50,687] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:36:50.513509', 'RPC error': '2022-07-01 03:41:50.687140'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,689] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:36:50.513441', 'RPC error': '2022-07-01 03:41:50.689364'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,691] [ WARNING] - Retry [describe_collection] No.1 in 0.2s, retry reason: <_MultiThreadedRendezvous: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded> (pymilvus.decorators:68)
[2022-07-01 03:41:50,695] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:36:50.514068', 'RPC error': '2022-07-01 03:41:50.695225'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,697] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:36:50.513944', 'RPC error': '2022-07-01 03:41:50.697039'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,698] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:36:50.514508', 'RPC error': '2022-07-01 03:41:50.698161'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,700] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:36:50.514443', 'RPC error': '2022-07-01 03:41:50.700142'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,701] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-07-01 03:36:50.514939', 'RPC error': '2022-07-01 03:41:50.701460'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,705] [   ERROR] - RPC error: [bulk_insert], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)>, <Time:{'RPC start': '2022-07-01 03:36:50.514903', 'RPC error': '2022-07-01 03:41:50.705194'}> (pymilvus.decorators:94)
[2022-07-01 03:41:50,707] [   ERROR] - <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 300s)> (milvus_benchmark.client:169)
[2022-07-01 03:42:50,717] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656646970.711904602","description":"Error received from peer ipv4:10.96.80.134:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-01 03:41:50.511150', 'RPC error': '2022-07-01 03:42:50.717379'}> (pymilvus.decorators:94)
[2022-07-01 03:42:50,737] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656646970.712885895","description":"Error received from peer ipv4:10.96.80.134:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-01 03:41:50.660082', 'RPC error': '2022-07-01 03:42:50.737539'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,718] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:41:50.711836', 'RPC error': '2022-07-01 03:47:50.718787'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,883] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:41:50.711784', 'RPC error': '2022-07-01 03:47:50.883040'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,884] [   ERROR] - RPC error: [describe_collection], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:41:50.712126', 'RPC error': '2022-07-01 03:47:50.884778'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,886] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=rpc deadline exceeded: Retry timeout: 60s)>, <Time:{'RPC start': '2022-07-01 03:41:50.712075', 'RPC error': '2022-07-01 03:47:50.886357'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,888] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656647030.714960614","description":"Error received from peer ipv4:10.96.80.134:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-01 03:41:50.660611', 'RPC error': '2022-07-01 03:47:50.888308'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,889] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656647030.715049843","description":"Error received from peer ipv4:10.96.80.134:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-01 03:41:50.660375', 'RPC error': '2022-07-01 03:47:50.889671'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,891] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656647030.715904910","description":"Error received from peer ipv4:10.96.80.134:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-01 03:41:50.676422', 'RPC error': '2022-07-01 03:47:50.891274'}> (pymilvus.decorators:94)
[2022-07-01 03:47:50,892] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656647030.715936485","description":"Error received from peer ipv4:10.96.80.134:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-01 03:41:50.661114', 'RPC error': '2022-07-01 03:47:50.892749'}> (pymilvus.decorators:94)

jingkl avatar Jul 01 '22 06:07 jingkl

argo test-etcd-no-clean-xv4dw-1
server-configmap server-cluster-dn2c10m-in8c32m-qn8c64m-compaction
client-configmap client-random-locust-insert-delete-18h
2.1.0-20220704-f6ce0559
pymilvus 2.1.0dev87

server:

test-etcd-no-clean-xv4dw-1-0                                    1/1     Running     0          20m     10.97.17.72    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-1                                    1/1     Running     0          20m     10.97.16.196   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-2                                    1/1     Running     0          20m     10.97.17.73    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-datacoord-84576579f9-b6tjf    1/1     Running     0          20m     10.97.5.183    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-datanode-7d995457b8-z9ngh     1/1     Running     1          20m     10.97.20.151   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-indexcoord-546d7986-zqwth     1/1     Running     1          20m     10.97.3.150    qa-node001.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-indexnode-646955f47c-fbtjg    1/1     Running     0          20m     10.97.16.194   qa-node013.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-proxy-74645649d-tqbzp         1/1     Running     1          20m     10.97.3.148    qa-node001.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-querycoord-7c6cbc7c5c-jqbw5   1/1     Running     1          20m     10.97.3.155    qa-node001.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-querynode-5d487d676c-gtdqp    1/1     Running     0          20m     10.97.19.195   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-milvus-rootcoord-6b87b58bcf-ll7l4    1/1     Running     0          20m     10.97.18.193   qa-node017.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-minio-0                              1/1     Running     0          20m     10.97.19.199   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-minio-1                              1/1     Running     0          20m     10.97.20.155   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-minio-2                              1/1     Running     0          20m     10.97.12.158   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-minio-3                              1/1     Running     0          20m     10.97.12.157   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-bookie-0                      1/1     Running     0          20m     10.97.12.154   qa-node015.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-bookie-1                      1/1     Running     0          20m     10.97.19.200   qa-node016.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-bookie-2                      1/1     Running     0          20m     10.97.20.156   qa-node018.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-bookie-init-gqln2             0/1     Completed   0          20m     10.97.5.182    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-broker-0                      1/1     Running     0          20m     10.97.18.194   qa-node017.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-proxy-0                       1/1     Running     0          20m     10.97.17.69    qa-node014.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-pulsar-init-8zfdx             0/1     Completed   0          20m     10.97.5.181    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-recovery-0                    1/1     Running     0          20m     10.97.3.159    qa-node001.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-zookeeper-0                   1/1     Running     0          20m     10.97.5.185    qa-node003.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-zookeeper-1                   1/1     Running     0          19m     10.97.3.164    qa-node001.zilliz.local   <none>           <none>
test-etcd-no-clean-xv4dw-1-pulsar-zookeeper-2                   1/1     Running     0          19m     10.97.5.187    qa-node003.zilliz.local   <none>           <none>

client log:

[2022-07-04 03:25:39,947] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656904919.335897832","description":"Error received from peer ipv4:10.96.233.191:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-04 03:20:45.915823', 'RPC error': '2022-07-04 03:25:39.947071'}> (pymilvus.decorators:94)
[2022-07-04 03:25:39,947] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656904919.337833012","description":"Error received from peer ipv4:10.96.233.191:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-04 03:20:45.916123', 'RPC error': '2022-07-04 03:25:39.947421'}> (pymilvus.decorators:94)
[2022-07-04 03:25:39,947] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656904919.338824105","description":"Error received from peer ipv4:10.96.233.191:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-04 03:20:45.916000', 'RPC error': '2022-07-04 03:25:39.947637'}> (pymilvus.decorators:94)
[2022-07-04 03:25:39,947] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=<_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1656904919.338842114","description":"Error received from peer ipv4:10.96.233.191:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2022-07-04 03:20:45.916246', 'RPC error': '2022-07-04 03:25:39.947830'}> (pymilvus.decorators:94)

jingkl avatar Jul 04 '22 06:07 jingkl

We found that when tbb::concurrent_unordered_multimap holds a large number of duplicate keys, equal_range becomes very slow, which results in query timeouts.
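
A rough way to picture the effect, as a Go sketch under stated assumptions: `chainMultimap` below is a hypothetical, simplified hash multimap with per-bucket chains. It is not TBB's implementation and not the Milvus code. It only shows that a lookup has to walk every entry stored under a heavily duplicated key, while the same number of distinct keys spreads across buckets.

```go
package main

import "fmt"

// entry is one key/value pair; duplicate keys are stored as separate entries.
type entry struct{ key, val int }

// chainMultimap is a hypothetical, simplified hash multimap for illustration.
// Keys are assumed to be non-negative.
type chainMultimap struct{ buckets [][]entry }

func newChainMultimap(n int) *chainMultimap {
	return &chainMultimap{buckets: make([][]entry, n)}
}

// insert appends the pair to the chain of the key's bucket.
func (m *chainMultimap) insert(key, val int) {
	b := key % len(m.buckets)
	m.buckets[b] = append(m.buckets[b], entry{key, val})
}

// equalRange returns all values stored under key and the number of
// chain entries that had to be visited to collect them.
func (m *chainMultimap) equalRange(key int) (vals []int, visited int) {
	for _, e := range m.buckets[key%len(m.buckets)] {
		visited++
		if e.key == key {
			vals = append(vals, e.val)
		}
	}
	return vals, visited
}

func main() {
	dup := newChainMultimap(1024)
	distinct := newChainMultimap(1024)
	for i := 0; i < 10000; i++ {
		dup.insert(7, i)      // the same key every time
		distinct.insert(i, i) // a different key every time
	}
	v, n := dup.equalRange(7)
	fmt.Printf("duplicate keys: %d values, %d entries visited\n", len(v), n)
	v, n = distinct.equalRange(7)
	fmt.Printf("distinct keys:  %d values, %d entries visited\n", len(v), n)
}
```

With 10000 inserts into 1024 buckets, the duplicated key forces 10000 visits per lookup, while the distinct-key case visits only the 10 entries that share the bucket.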

xige-16 avatar Jul 08 '22 14:07 xige-16

Please verify again @jingkl
/assign @jingkl
/unassign @czs007
/unassign @sunby

xige-16 avatar Aug 11 '22 08:08 xige-16

server-instance fouram-tag-no-clean-st2sf-1
server-configmap server-single-32c128m-compaction
client-configmap client-random-locust-insert-delete-35h
pymilvus 2.1.1dev3
milvus 2.1.0-20220809-0e4dc112

)>, <Time:{'RPC start': '2022-08-18 14:12:35.009982', 'RPC error': '2022-08-18 14:13:05.052359'}> (pymilvus.decorators:95)
[2022-08-18 14:13:05,052] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-08-18 14:12:35.010413', 'RPC error': '2022-08-18 14:13:05.052534'}> (pymilvus.decorators:95)
[2022-08-18 14:13:05,052] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-08-18 14:12:35.010326', 'RPC error': '2022-08-18 14:13:05.052677'}> (pymilvus.decorators:95)
[2022-08-18 14:13:15,069] [   DEBUG] - Milvus delete run in 20.0246s (milvus_benchmark.client:56)
[2022-08-18 14:13:15,080] [   DEBUG] - Milvus load_collection run in 30.0584s (milvus_benchmark.client:56)
[2022-08-18 14:13:15,080] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-08-18 14:12:45.021720', 'RPC error': '2022-08-18 14:13:15.080835'}> (pymilvus.decorators:95)
[2022-08-18 14:13:15,081] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-08-18 14:12:45.021965', 'RPC error': '2022-08-18 14:13:15.081076'}> (pymilvus.decorators:95)
[2022-08-18 14:13:15,081] [   ERROR] - RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded
)>, <Time:{'RPC start': '2022-08-18 14:12:45.021847', 'RPC error': '2022-08-18 14:13:15.081275'}> (pymilvus.decorators:95)
[2022-08-18 14:13:15,081] [   ERROR] - RPC error: [query], <MilvusException: (code=1, message=fail to search on all shard leaders, err=All attempts results:
attempt #1:fail to get shard leaders from QueryCoord: no replica available
attempt #2:fail to get shard leaders from QueryCoord: no replica available
attempt #3:fail to get shard leaders from QueryCoord: no replica available
attempt #4:fail to get shard leaders from QueryCoord: no replica available
attempt #5:fail to get shard leaders from QueryCoord: no replica available
attempt #6:fail to get shard leaders from QueryCoord: no replica available
attempt #7:fail to get shard leaders from QueryCoord: no replica available
attempt #8:context deadline exceeded

server:

fouram-tag-no-clean-st2sf-1-etcd-0                               1/1     Running     0             43h   10.104.6.82    4am-node13   <none>           <none>
fouram-tag-no-clean-st2sf-1-milvus-standalone-546c787d9-xzx6b    1/1     Running     1 (23h ago)   43h   10.104.5.197   4am-node12   <none>           <none>
fouram-tag-no-clean-st2sf-1-minio-7766b7b5df-qshmk               1/1     Running     0             43h   10.104.6.81    4am-node13   <none>           <none>

standalone: (screenshot 2022-08-19 10 50 55)

jingkl avatar Aug 19 '22 02:08 jingkl

server-configmap server-single-8c32m
client-configmap client-random-locust-hnsw-compaction-search-5h
image: 2.2.0-20221130-c5f215da
pymilvus: 2.2.0dev72

This problem no longer occurs.

jingkl avatar Dec 01 '22 02:12 jingkl