milvus
[Bug]: Search failed with error message `ShardCluster for xxx replicaID 434611346741395478 is no available` after minio pod kill chaos test
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: master-20220715-f0846fb7
- Deployment mode (standalone or cluster): cluster
- SDK version (e.g. pymilvus v2.0.0rc2): pymilvus==2.1.0.dev99
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
Search raises an error when running test_e2e.py after the MinIO pod-kill chaos test:
[2022-07-15 18:57:01 - ERROR - pymilvus.decorators]: RPC error: [search], <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=1, reason=ShardCluster for by-dev-rootcoord-dml_150_434611525181243393v0 replicaID 434611346741395478 is no available)>, <Time:{'RPC start': '2022-07-15 18:57:01.259014', 'RPC error': '2022-07-15 18:57:01.379640'}> (decorators.py:94)
[2022-07-15 18:57:01 - ERROR - ci_test]: Traceback (most recent call last):
File "/home/runner/work/milvus/milvus/tests/python_client/utils/api_request.py", line 26, in inner_wrapper
res = func(*args, **_kwargs)
File "/home/runner/work/milvus/milvus/tests/python_client/utils/api_request.py", line 57, in api_request
return func(*arg, **kwargs)
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/orm/collection.py", line 712, in search
res = conn.search(self._name, data, anns_field, param, limit, expr,
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 95, in handler
raise e
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 91, in handler
return func(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 73, in handler
raise e
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/decorators.py", line 48, in handler
return func(self, *args, **kwargs)
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 446, in search
return self._execute_search_requests(requests, timeout, **_kwargs)
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 410, in _execute_search_requests
raise pre_err
File "/opt/hostedtoolcache/Python/3.8.13/x64/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 401, in _execute_search_requests
raise MilvusException(response.status.error_code, response.status.reason)
pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=1, reason=ShardCluster for by-dev-rootcoord-dml_150_434611525181243393v0 replicaID 434611346741395478 is no available)>
(api_request.py:39)
[2022-07-15 18:57:01 - ERROR - ci_test]: (api_response) : <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=1, reason=ShardCluster for by-dev-rootcoord-dml_150_434611525181243393v0 replicaID 434611346741395478 is no available)> (api_request.py:40)
Expected Behavior
All test cases pass.
Steps To Reproduce
See the failed CI job: https://github.com/milvus-io/milvus/runs/7363254556?check_suite_focus=true
Milvus Log
Failed job: https://github.com/milvus-io/milvus/runs/7363254556?check_suite_focus=true
Log: https://github.com/milvus-io/milvus/suites/7378579518/artifacts/299999737
Anything else?
Loading the collection took a long time (about 20 s):
[2022-07-15 18:57:01 - INFO - ci_test]: [test][2022-07-15T18:56:40Z] [20.49685979s] e2e__O4gvWuXb load -> None (wrapper.py:30)
The proxy log also contains a warning at the time of the failed search:
[2022/07/15 18:57:01.271 +00:00] [WARN] [proxy/task_search.go:485] ["collection not fully loaded, search on these partitions"] [collection=e2e__O4gvWuXb] [collectionID=434611525181243393] [partitionIDs="[434611525181243394]"]
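One way to check whether the failure is transient (shard leaders still recovering after the pod kill) or persistent is to wrap the search call in a bounded retry. The following is only a minimal sketch: `retry_on_error` and `is_shard_unavailable` are hypothetical helper names, not part of pymilvus or the Milvus test suite, and the attempt count and delay are arbitrary assumptions.

```python
import time


def is_shard_unavailable(exc):
    # Assumption: match the (verbatim) error text seen in this report.
    return "is no available" in str(exc)


def retry_on_error(func, attempts=5, delay_s=2.0,
                   should_retry=is_shard_unavailable, sleep=time.sleep):
    """Call func() until it succeeds or the attempts are exhausted.

    Exceptions that should_retry() rejects are re-raised immediately.
    The last retriable exception is re-raised after the final attempt.
    """
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            if not should_retry(exc):
                raise
            last_exc = exc
            if attempt < attempts:
                sleep(delay_s)
    raise last_exc
```

In the e2e test this could wrap the failing call, for example `retry_on_error(lambda: collection.search(...))` with the same arguments the test already passes. If the search still fails after all attempts, the shard cluster is likely not recovering on its own.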
/assign @soothing-rain /unassign
/unassign /assign @jiaoew1991
This is no longer reproducible, so I am closing it.