
[Bug]: Search failed with error message `ShardCluster for xxx replicaID 434611346741395478 is no available` after minio pod kill chaos test

Open zhuwenxing opened this issue 2 years ago • 1 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: master-20220715-f0846fb7
- Deployment mode(standalone or cluster): cluster
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus==2.1.0.dev99
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Search raises an error when running test_e2e.py after the MinIO pod-kill chaos test:

[2022-07-15 18:57:01 - ERROR - pymilvus.decorators]: RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=1, reason=ShardCluster for by-dev-rootcoord-dml_150_434611525181243393v0 replicaID 434611346741395478 is no available)>, <Time:{'RPC start': '2022-07-15 18:57:01.259014', 'RPC error': '2022-07-15 18:57:01.379640'}> (decorators.py:94)
[2022-07-15 18:57:01 - ERROR - ci_test]: Traceback (most recent call last):
  File "/home/runner/work/milvus/milvus/tests/python_client/utils/api_request.py", line 26, in inner_wrapper
    res = func(*args, **_kwargs)
  File "/home/runner/work/milvus/milvus/tests/python_client/utils/api_request.py", line 57, in api_request
    return func(*arg, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/orm/collection.py", line 712, in search
    res = conn.search(self._name, data, anns_field, param, limit, expr,
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 95, in handler
    raise e
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 91, in handler
    return func(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 73, in handler
    raise e
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 48, in handler
    return func(self, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 446, in search
    return self._execute_search_requests(requests, timeout, **_kwargs)
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 410, in _execute_search_requests
    raise pre_err
  File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 401, in _execute_search_requests
    raise MilvusException(response.status.error_code, response.status.reason)
pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=1, reason=ShardCluster for by-dev-rootcoord-dml_150_434611525181243393v0 replicaID 434611346741395478 is no available)>
 (api_request.py:39)
[2022-07-15 18:57:01 - ERROR - ci_test]: (api_response) : <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=1, reason=ShardCluster for by-dev-rootcoord-dml_150_434611525181243393v0 replicaID 434611346741395478 is no available)> (api_request.py:40)
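For triage, the identifiers buried in this message (QueryNode ID, DML channel, replica ID) can be pulled out with a small regular expression. This is only a sketch: the pattern is written against the exact wording logged above, and `parse_shard_error` is a hypothetical helper name, not part of pymilvus.

```python
import re

# Pattern written against the message wording seen in this issue's log.
# Assumption: other Milvus versions may phrase this error differently.
SHARD_ERR = re.compile(
    r"QueryNode ID=(?P<node_id>\d+), "
    r"reason=ShardCluster for (?P<channel>\S+) "
    r"replicaID (?P<replica_id>\d+) is no available"
)


def parse_shard_error(message):
    """Return node ID, DML channel, and replica ID from the message, or None."""
    match = SHARD_ERR.search(message)
    if match is None:
        return None
    return {
        "node_id": int(match.group("node_id")),
        "channel": match.group("channel"),
        "replica_id": int(match.group("replica_id")),
    }


# Message text copied from the log in this issue.
msg = (
    "fail to search on all shard leaders, err=fail to Search, "
    "QueryNode ID=1, reason=ShardCluster for "
    "by-dev-rootcoord-dml_150_434611525181243393v0 "
    "replicaID 434611346741395478 is no available"
)
info = parse_shard_error(msg)
print(info)
```

The channel name extracted here contains the same number (434611525181243393) that the proxy warning further down reports as `collectionID`, which helps when matching this error against the proxy and QueryNode logs.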

Expected Behavior

All test cases pass.

Steps To Reproduce

See https://github.com/milvus-io/milvus/runs/7363254556?check_suite_focus=true

Milvus Log

Failed job: https://github.com/milvus-io/milvus/runs/7363254556?check_suite_focus=true
Log: https://github.com/milvus-io/milvus/suites/7378579518/artifacts/299999737

Anything else?

The load operation took a long time (about 20 s):

[2022-07-15 18:57:01 - INFO - ci_test]: [test][2022-07-15T18:56:40Z] [20.49685979s] e2e__O4gvWuXb load -> None (wrapper.py:30)

The proxy log also contains a warning:

[2022/07/15 18:57:01.271 +00:00] [WARN] [proxy/task_search.go:485] ["collection not fully loaded, search on these partitions"] [collection=e2e__O4gvWuXb] [collectionID=434611525181243393] [partitionIDs="[434611525181243394]"]
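Since the warning says the search ran while the collection was not fully loaded, one way to narrow this down in a test is to poll for a "fully loaded" condition before searching. The helper below is a generic sketch, not the project's test code: `wait_until` and its parameters are hypothetical, and `check` stands in for whatever load-state query the client offers (for example pymilvus's `utility.loading_progress`, whose return format is not assumed here).

```python
import time


def wait_until(check, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    `sleep` and `clock` are injectable so the helper can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with a fake condition that becomes true on the third poll.
# In a real test, `check` would query the server's load state instead.
state = {"polls": 0}


def fake_loaded():
    state["polls"] += 1
    return state["polls"] >= 3


loaded = wait_until(fake_loaded, timeout=5.0, interval=0.0)
print(loaded, state["polls"])
```

If the search still fails after such a wait reports success, the problem is more likely in shard-leader recovery after the MinIO restart than in the test issuing the search too early.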

zhuwenxing, Jul 18 '22 02:07

/assign @soothing-rain
/unassign

yanliang567, Jul 18 '22 02:07

/unassign
/assign @jiaoew1991

soothing-rain, Aug 25 '22 03:08

This can no longer be reproduced, so I am closing it.

zhuwenxing, Sep 21 '22 02:09