[Bug]: [perf] Milvus search reports an error when the search parameter nprobe is 512

Open jingkl opened this issue 1 year ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: master-20230412-43a9e175
- Deployment mode (standalone or cluster): standalone
- MQ type (rocksmq, pulsar or kafka): rocksmq
- SDK version (e.g. pymilvus v2.0.0rc2):
- OS (Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:

Current Behavior

release_name_prefix: fouramf-recall-master-1681268400

server:


fouramf-recall-68400-1-52-9831-etcd-0                             1/1     Running     0                86m     10.104.5.94     4am-node12   <none>           <none>
fouramf-recall-68400-1-52-9831-milvus-standalone-7d76c48bbt4tqq   1/1     Running     0                86m     10.104.14.197   4am-node18   <none>           <none>
fouramf-recall-68400-1-52-9831-minio-89489447d-mlc57              1/1     Running     0                86m     10.104.5.92     4am-node12   <none>           <none>

client log:

[2023-04-12 13:50:49,562 -  INFO - fouram]: [Base] Params of search: nq:10000, anns_field:float_vector, param:{'metric_type': 'IP', 'params': {'nprobe': 512}}, limit:10, expr:"None", kwargs:{} (base.py:338)
[2023-04-12 13:55:49,802 - ERROR - fouram]: RPC error: [search], <MilvusException: (code=1, message=<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1681307749.801421885","description":"Error received from peer ipv4:10.255.126.34:19530","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>, <Time:{'RPC start': '2023-04-12 13:50:49.565246', 'RPC error': '2023-04-12 13:55:49.802027'}> (decorators.py:108)
[2023-04-12 13:55:49,804 - ERROR - fouram]: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 50, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 518, in search
    return self._execute_search_requests(requests, timeout, round_decimal=round_decimal, auto_id=auto_id, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 487, in _execute_search_requests
    raise pre_err
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py", line 475, in _execute_search_requests
    response = self._stub.Search(request, timeout=timeout)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.8/dist-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1681307749.801421885","description":"Error received from peer ipv4:10.255.126.34:19530","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/src/fouram/client/util/api_request.py", line 33, in inner_wrapper
    res = func(*args, **kwargs)
  File "/src/fouram/client/util/api_request.py", line 70, in api_request
    return func(*arg, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/orm/collection.py", line 660, in search
    res = conn.search(self._name, data, anns_field, param, limit, expr,
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 109, in handler
    raise e
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 105, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 136, in handler
    ret = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 58, in handler
    raise MilvusException(message=str(e)) from e
pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1681307749.801421885","description":"Error received from peer ipv4:10.255.126.34:19530","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)>
 (api_request.py:48)
[2023-04-12 13:55:49,805 - ERROR - fouram]: (api_response) : <MilvusException: (code=1, message=<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1681307749.801421885","description":"Error received from peer ipv4:10.255.126.34:19530","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)> (api_request.py:49)
[2023-04-12 13:55:49,805 - ERROR - fouram]: [CheckFunc] search request check failed, response:<MilvusException: (code=1, message=<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1681307749.801421885","description":"Error received from peer ipv4:10.255.126.34:19530","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Deadline Exceeded","grpc_status":4}"
>)> (func_check.py:49)
[2023-04-12 13:55:49,805 - ERROR - fouram]: [AccCases] Search raise error:  (accuracy_cases.py:197)
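
The 'RPC start' and 'RPC error' timestamps in the first error line show how long the client waited before gRPC gave up. A minimal sketch of that arithmetic, using only the two values copied from the log above (the helper name `rpc_elapsed_seconds` is made up for illustration):

```python
from datetime import datetime

# Timestamp format used in the <Time:{...}> field of the client log.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def rpc_elapsed_seconds(rpc_start: str, rpc_error: str) -> float:
    """Return the number of seconds between 'RPC start' and 'RPC error'."""
    start = datetime.strptime(rpc_start, LOG_TIME_FORMAT)
    end = datetime.strptime(rpc_error, LOG_TIME_FORMAT)
    return (end - start).total_seconds()


# Values copied verbatim from the log line above.
elapsed = rpc_elapsed_seconds(
    "2023-04-12 13:50:49.565246",
    "2023-04-12 13:55:49.802027",
)
print(f"search ran for {elapsed:.3f} s before DEADLINE_EXCEEDED")
```

The search ran for roughly 300 seconds, which would be consistent with a client-side deadline of about five minutes. The log itself does not say where that limit is configured.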

Expected Behavior

No response

Steps To Reproduce

1. Create a collection.
2. Insert the training dataset.
3. Flush the collection.
4. Drop any existing index and build an IVF_FLAT index.
5. Load the collection.
6. Search with different parameters.
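
The steps above can be sketched with the pymilvus ORM API roughly as follows. This is not the fouram test harness: the collection name, host, port, random vectors (a stand-in for glove-200-angular), the smaller default `nq`, and the 300-second timeout are all assumptions chosen for illustration.

```python
import random


def ivf_flat_index_params(nlist: int = 1024, metric_type: str = "IP") -> dict:
    """Index parameters matching the IVF_FLAT / nlist=1024 setup in this report."""
    return {"index_type": "IVF_FLAT", "metric_type": metric_type, "params": {"nlist": nlist}}


def ivf_search_param(nprobe: int, metric_type: str = "IP") -> dict:
    """Search parameter dict in the same shape as the one in the client log."""
    return {"metric_type": metric_type, "params": {"nprobe": nprobe}}


def reproduce(host="localhost", port="19530", dim=200, num_rows=10_000,
              nq=100, nprobe=512, timeout=300.0):
    # Imported lazily so the helpers above work without pymilvus installed.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

    connections.connect(alias="default", host=host, port=port)

    # 1. Create a collection (hypothetical name).
    schema = CollectionSchema([
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
        FieldSchema(name="float_vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ])
    collection = Collection(name="repro_nprobe_512", schema=schema)

    # 2. Insert data (random vectors stand in for the real dataset).
    ids = list(range(num_rows))
    vectors = [[random.random() for _ in range(dim)] for _ in range(num_rows)]
    collection.insert([ids, vectors])

    # 3. Flush.
    collection.flush()

    # 4. Drop any existing index, then build IVF_FLAT.
    if collection.has_index():
        collection.drop_index()
    collection.create_index("float_vector", ivf_flat_index_params())

    # 5. Load.
    collection.load()

    # 6. Search; `timeout` is the client-side deadline in seconds.
    return collection.search(data=vectors[:nq], anns_field="float_vector",
                             param=ivf_search_param(nprobe), limit=10, timeout=timeout)
```

Calling `reproduce()` requires a reachable Milvus instance. Raising `nq` to 10000 and loading the real dataset would be needed to approach the workload in the report.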

Milvus Log

No response

Anything else?

[2023-04-12 12:33:11,947 - INFO - fouram]: [check_params] scene_recall required params: {'dataset_params': {'dim': 200, 'dataset_name': 'glove-200-angular', 'ni_per': 10000}, 'collection_params': {'other_fields': []}, 'load_params': {'replica_number': 1}, 'search_params': {'top_k': [10], 'nq': [10000], 'search_param': {'nprobe': [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]}}, 'index_params': {'index_type': 'IVF_FLAT', 'index_param': {'nlist': 1024}}} (params_check.py:31)
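
An IVF index partitions the vectors into `nlist` clusters, and each query scans the `nprobe` closest clusters, so search cost grows roughly in proportion to `nprobe`. A small sketch of how much of the index each value in the sweep above touches, assuming evenly sized clusters (an idealization; the function name `probed_fraction` is made up here):

```python
def probed_fraction(nprobe: int, nlist: int) -> float:
    """Fraction of IVF clusters scanned per query."""
    if not 0 < nprobe <= nlist:
        raise ValueError("nprobe must be in the range 1..nlist")
    return nprobe / nlist


NLIST = 1024  # from index_param in the log line above
NPROBES = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]  # from search_param

for nprobe in NPROBES:
    fraction = probed_fraction(nprobe, NLIST)
    print(f"nprobe={nprobe:>3}: scans {fraction:.2%} of {NLIST} clusters")
```

At nprobe=512 each query scans half of the clusters, 512 times as many as at nprobe=1. With nq=10000 in a single request, this is the most expensive search in the sweep by a wide margin.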

jingkl • Apr 14 '23 08:04

@jingkl Is it a timeout issue? /assign @jiaoew1991 /unassign

yanliang567 • Apr 14 '23 09:04

@yanliang567 It's a timeout problem, but it reproduces every time.

jingkl • Apr 14 '23 10:04

I don't think we should run nprobe 512 with nlist 1024, so let's set this to low priority. If someone really needs it, the user has to tune the timeout.

xiaofan-luan • Apr 14 '23 17:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] • May 15 '23 14:05