milvus icon indicating copy to clipboard operation
milvus copied to clipboard

[Bug]: [benchmark][cluster] Low CPU usage and high latency of hybrid_search in concurrent hybrid_search scenario

Open wangting0128 opened this issue 11 months ago • 22 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version:master-20240226-befe0e21-amd64
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):pulsar    
- SDK version(e.g. pymilvus v2.0.0rc2):2.4.0rc19
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: multi-vector-corn-mfmgj test case name: test_hybrid_search_locust_shard1_dql_indexes_replicas_cluster

server:

NAME                                                              READY   STATUS                            RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
multi-vector-corn-mfmgj-11-etcd-0                                 1/1     Running                           0                12h     10.104.20.65    4am-node22   <none>           <none>
multi-vector-corn-mfmgj-11-etcd-1                                 1/1     Running                           0                12h     10.104.34.232   4am-node37   <none>           <none>
multi-vector-corn-mfmgj-11-etcd-2                                 1/1     Running                           0                12h     10.104.30.216   4am-node38   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-datacoord-d86db89c6-gmxsk       1/1     Running                           0                12h     10.104.1.93     4am-node10   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-datanode-67dcc88689-6x8tv       1/1     Running                           1 (12h ago)      12h     10.104.34.224   4am-node37   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-indexcoord-6ccd668df-s87gv      1/1     Running                           0                12h     10.104.13.228   4am-node16   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-indexnode-67756c8444-8cclm      1/1     Running                           0                12h     10.104.26.162   4am-node32   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-proxy-7b9c596cb7-267dt          1/1     Running                           1 (12h ago)      12h     10.104.1.92     4am-node10   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-querycoord-7f84b66654-jrck4     1/1     Running                           1 (12h ago)      12h     10.104.13.227   4am-node16   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-querynode-95c8b94f4-25trp       1/1     Running                           0                12h     10.104.32.108   4am-node39   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-querynode-95c8b94f4-6wrw9       1/1     Running                           0                12h     10.104.13.229   4am-node16   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-querynode-95c8b94f4-tm282       1/1     Running                           0                12h     10.104.16.113   4am-node21   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-querynode-95c8b94f4-w4cmc       1/1     Running                           0                12h     10.104.12.190   4am-node17   <none>           <none>
multi-vector-corn-mfmgj-11-milvus-rootcoord-5b895f49c6-6n42r      1/1     Running                           1 (12h ago)      12h     10.104.26.160   4am-node32   <none>           <none>
multi-vector-corn-mfmgj-11-minio-0                                1/1     Running                           0                12h     10.104.15.112   4am-node20   <none>           <none>
multi-vector-corn-mfmgj-11-minio-1                                1/1     Running                           0                12h     10.104.28.155   4am-node33   <none>           <none>
multi-vector-corn-mfmgj-11-minio-2                                1/1     Running                           0                12h     10.104.29.81    4am-node35   <none>           <none>
multi-vector-corn-mfmgj-11-minio-3                                1/1     Running                           0                12h     10.104.30.215   4am-node38   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-bookie-0                        1/1     Running                           0                12h     10.104.28.156   4am-node33   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-bookie-1                        1/1     Running                           0                12h     10.104.15.117   4am-node20   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-bookie-2                        1/1     Running                           0                12h     10.104.25.160   4am-node30   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-bookie-init-lgxgl               0/1     Completed                         0                12h     10.104.28.146   4am-node33   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-broker-0                        1/1     Running                           0                12h     10.104.20.57    4am-node22   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-proxy-0                         1/1     Running                           0                12h     10.104.1.95     4am-node10   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-pulsar-init-mnqwl               0/1     Completed                         0                12h     10.104.13.226   4am-node16   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-recovery-0                      1/1     Running                           0                12h     10.104.1.94     4am-node10   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-zookeeper-0                     1/1     Running                           0                12h     10.104.29.80    4am-node35   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-zookeeper-1                     1/1     Running                           0                12h     10.104.28.167   4am-node33   <none>           <none>
multi-vector-corn-mfmgj-11-pulsar-zookeeper-2                     1/1     Running                           0                12h     10.104.26.169   4am-node32   <none>           <none> 

image image 截屏2024-02-27 15 35 49

截屏2024-02-27 15 36 39 截屏2024-02-27 15 37 29

client pod name: multi-vector-corn-mfmgj-2255719034 client log: client.log

[2024-02-26 16:10:04,089 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 16:10:04,089 -  INFO - fouram]:          Aggregated                                                                      6144     0(0.00%) |  24199    1926  106004  20000 |    0.30        0.00 (stats.py:789)
[2024-02-26 16:10:04,089 -  INFO - fouram]:  (stats.py:790)
[2024-02-26 16:10:04,090 -  INFO - fouram]: Response time percentiles (approximated) (stats.py:819)
[2024-02-26 16:10:04,090 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2024-02-26 16:10:04,090 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2024-02-26 16:10:04,090 -  INFO - fouram]: grpc     hybrid_search                                                                       20000  25000  30000  33000  43000  55000  68000  78000  97000 106000 106000   6144 (stats.py:819)
[2024-02-26 16:10:04,090 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2024-02-26 16:10:04,090 -  INFO - fouram]:          Aggregated                                                                          20000  25000  30000  33000  43000  55000  68000  78000  97000 106000 106000   6144 (stats.py:819)
[2024-02-26 16:10:04,090 -  INFO - fouram]:  (stats.py:820)
[2024-02-26 16:10:14,637 - ERROR - fouram]: grpc RpcError: [hybrid_search], <_InactiveRpcError: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2024-02-26 16:08:14.635502', 'gRPC error': '2024-02-26 16:10:14.637707'}> (decorators.py:145)
[2024-02-26 16:10:14,639 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:14.637196947+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
>, [requestId: 3f085ee2-d4c1-11ee-b173-22f6251fc128] (api_request.py:57)
[2024-02-26 16:10:14,639 - ERROR - fouram]: [CheckFunc] hybrid_search request check failed, response:<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:14.637196947+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
> (func_check.py:48)
[2024-02-26 16:10:24,091 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-02-26 16:10:24,092 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 16:10:24,092 -  INFO - fouram]: grpc     hybrid_search                                                                   6156     1(0.02%) |  24232    1926  120003  20000 |    0.90        0.00 (stats.py:789)
[2024-02-26 16:10:24,092 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 16:10:24,092 -  INFO - fouram]:          Aggregated                                                                      6156     1(0.02%) |  24232    1926  120003  20000 |    0.90        0.00 (stats.py:789)
[2024-02-26 16:10:24,092 -  INFO - fouram]:  (stats.py:790)
[2024-02-26 16:10:24,092 -  INFO - fouram]: Response time percentiles (approximated) (stats.py:819)
[2024-02-26 16:10:24,092 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2024-02-26 16:10:24,092 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2024-02-26 16:10:24,092 -  INFO - fouram]: grpc     hybrid_search                                                                       20000  25000  30000  33000  43000  55000  68000  79000  97000 120000 120000   6156 (stats.py:819)
[2024-02-26 16:10:24,092 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2024-02-26 16:10:24,092 -  INFO - fouram]:          Aggregated                                                                          20000  25000  30000  33000  43000  55000  68000  79000  97000 120000 120000   6156 (stats.py:819)
[2024-02-26 16:10:24,093 -  INFO - fouram]:  (stats.py:820)
[2024-02-26 16:10:41,635 - ERROR - fouram]: grpc RpcError: [hybrid_search], <_InactiveRpcError: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2024-02-26 16:08:41.632207', 'gRPC error': '2024-02-26 16:10:41.635128'}> (decorators.py:145)
[2024-02-26 16:10:41,636 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Deadline Exceeded", grpc_status:4, created_time:"2024-02-26T16:10:41.634192847+00:00"}"
>, [requestId: 4f1fbe4c-d4c1-11ee-b173-22f6251fc128] (api_request.py:57)
[2024-02-26 16:10:41,636 - ERROR - fouram]: [CheckFunc] hybrid_search request check failed, response:<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Deadline Exceeded", grpc_status:4, created_time:"2024-02-26T16:10:41.634192847+00:00"}"
> (func_check.py:48)
[2024-02-26 16:10:41,703 - ERROR - fouram]: grpc RpcError: [hybrid_search], <_InactiveRpcError: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2024-02-26 16:08:41.695219', 'gRPC error': '2024-02-26 16:10:41.703561'}> (decorators.py:145)
[2024-02-26 16:10:41,704 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:41.697174128+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
>, [requestId: 4f295b96-d4c1-11ee-b173-22f6251fc128] (api_request.py:57)
[2024-02-26 16:10:41,704 - ERROR - fouram]: [CheckFunc] hybrid_search request check failed, response:<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:41.697174128+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
> (func_check.py:48)
[2024-02-26 16:10:41,977 - ERROR - fouram]: grpc RpcError: [hybrid_search], <_InactiveRpcError: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2024-02-26 16:08:41.761037', 'gRPC error': '2024-02-26 16:10:41.977427'}> (decorators.py:145)
[2024-02-26 16:10:41,978 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:41.764178367+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
>, [requestId: 4f33638e-d4c1-11ee-b173-22f6251fc128] (api_request.py:57)
[2024-02-26 16:10:41,978 - ERROR - fouram]: [CheckFunc] hybrid_search request check failed, response:<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:41.764178367+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
> (func_check.py:48)
[2024-02-26 16:10:41,978 - ERROR - fouram]: grpc RpcError: [hybrid_search], <_InactiveRpcError: StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded>, <Time:{'RPC start': '2024-02-26 16:08:41.833096', 'gRPC error': '2024-02-26 16:10:41.978770'}> (decorators.py:145)
[2024-02-26 16:10:41,979 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:41.835168737+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
>, [requestId: 4f3e6496-d4c1-11ee-b173-22f6251fc128] (api_request.py:57)
[2024-02-26 16:10:41,979 - ERROR - fouram]: [CheckFunc] hybrid_search request check failed, response:<_InactiveRpcError of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-02-26T16:10:41.835168737+00:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
> (func_check.py:48)
[2024-02-26 16:10:44,094 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-02-26 16:10:44,094 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 16:10:44,094 -  INFO - fouram]: grpc     hybrid_search                                                                   6173     5(0.08%) |  24319    1926  120217  20000 |    0.20        0.00 (stats.py:789)
[2024-02-26 16:10:44,094 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 16:10:44,094 -  INFO - fouram]:          Aggregated                                                                      6173     5(0.08%) |  24319    1926  120217  20000 |    0.20        0.00 (stats.py:789)
[2024-02-26 16:10:44,094 -  INFO - fouram]:  (stats.py:790)
[2024-02-26 16:10:44,095 -  INFO - fouram]: Response time percentiles (approximated) (stats.py:819)
[2024-02-26 16:10:44,095 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2024-02-26 16:10:44,095 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)

image

Expected Behavior

No response

Steps To Reproduce

concurrent test and calculation of RT and QPS

        :purpose:  `shard_num=1, replica=3, binary_vector & float_vector DQL`
            verify concurrent DQL scenario which has 4 binary_vector fields(BIN_IVF_FLAT, BIN_FLAT) and 60 scalar fields

        :test steps:
            1. create collection with fields:
                'float_vector': 32768dim,
                'binary_vector_1': 32768dim,
                'float_vector_2': 32768dim,
                'binary_vector_3': 32768dim,
                all scalar fields: varchar max_length=10, array max_capacity=7
            2. build indexes:
                HNSW: 'float_vector'
                BIN_IVF_FLAT: 'binary_vector_1',
                DISKANN: 'float_vector_2'
                BIN_FLAT: 'binary_vector_3'
                default_scalar_index: 'int64_1'
                INVERTED: 'id', 'bool_3'
            3. insert 100k data
            4. flush collection
            5. build indexes again using the same params
            6. load collection
                replica: 3
            7. concurrent request:
                - hybrid_search

Milvus Log

No response

Anything else?

test result:

[2024-02-26 19:04:30,997 -  INFO - fouram]: Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs (stats.py:819)
[2024-02-26 19:04:30,997 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2024-02-26 19:04:30,997 -  INFO - fouram]: grpc     hybrid_search                                                                       21000  26000  30000  33000  44000  55000  70000  79000 103000 120000 120000  14339 (stats.py:819)
[2024-02-26 19:04:30,997 -  INFO - fouram]: --------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------ (stats.py:819)
[2024-02-26 19:04:30,997 -  INFO - fouram]:          Aggregated                                                                          21000  26000  30000  33000  44000  55000  70000  79000 103000 120000 120000  14339 (stats.py:819)
[2024-02-26 19:04:30,997 -  INFO - fouram]:  (stats.py:820)
[2024-02-26 19:04:39,983 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-02-26 19:04:39,983 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-02-26 19:04:39,983 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 19:04:39,983 -  INFO - fouram]: grpc     hybrid_search                                                                  14348     5(0.03%) |  24905    1864  120217  21000 |    0.80        0.00 (stats.py:789)
[2024-02-26 19:04:39,983 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-02-26 19:04:39,983 -  INFO - fouram]:          Aggregated                                                                     14348     5(0.03%) |  24905    1864  120217  21000 |    0.80        0.00 (stats.py:789)
[2024-02-26 19:04:39,983 -  INFO - fouram]:  (stats.py:790)
[2024-02-26 19:04:39,988 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_8c16m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': '16.0',
                                                              'memory': '64Gi'},
                                                   'requests': {'cpu': '9.0',
                                                                'memory': '33Gi'}},
                                     'replicas': 4},
                       'indexNode': {'resources': {'limits': {'cpu': '8.0',
                                                              'memory': '16Gi'},
                                                   'requests': {'cpu': '5.0',
                                                                'memory': '9Gi'}}},
                       'dataNode': {'resources': {'limits': {'cpu': '8.0',
                                                             'memory': '16Gi'},
                                                  'requests': {'cpu': '5.0',
                                                               'memory': '9Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsar': {},
                       'kafka': {},
                       'minio': {'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'metrics': {'enabled': True,
                                            'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus',
                                         'tag': 'master-20240226-befe0e21-amd64'}}},
            'host': 'multi-vector-corn-mfmgj-11-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_hybrid_search_locust_shard1_dql_indexes_replicas_cluster',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 32768,
                                                    'max_length': 10,
                                                    'scalars_index': {'int64_1': {},
                                                                      'id': {'index_type': 'INVERTED'},
                                                                      'bool_3': {'index_type': 'INVERTED'}},
                                                    'vectors_index': {'binary_vector_1': {'index_type': 'BIN_IVF_FLAT',
                                                                                          'index_param': {'nlist': 2048},
                                                                                          'metric_type': 'JACCARD'},
                                                                      'float_vector_2': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'},
                                                                      'binary_vector_3': {'index_type': 'BIN_FLAT',
                                                                                          'index_param': {'nlist': 2048},
                                                                                          'metric_type': 'JACCARD'}},
                                                    'scalars_params': {'array_int8_1': {'params': {'max_capacity': 7}},
                                                                       'array_int16_1': {'params': {'max_capacity': 7}},
                                                                       'array_int32_1': {'params': {'max_capacity': 7}},
                                                                       'array_int64_1': {'params': {'max_capacity': 7}},
                                                                       'array_double_1': {'params': {'max_capacity': 7}},
                                                                       'array_float_1': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_1': {'params': {'max_capacity': 7}},
                                                                       'array_bool_1': {'params': {'max_capacity': 7}},
                                                                       'array_int8_2': {'params': {'max_capacity': 7}},
                                                                       'array_int16_2': {'params': {'max_capacity': 7}},
                                                                       'array_int32_2': {'params': {'max_capacity': 7}},
                                                                       'array_int64_2': {'params': {'max_capacity': 7}},
                                                                       'array_double_2': {'params': {'max_capacity': 7}},
                                                                       'array_float_2': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_2': {'params': {'max_capacity': 7}},
                                                                       'array_bool_2': {'params': {'max_capacity': 7}},
                                                                       'array_int8_3': {'params': {'max_capacity': 7}},
                                                                       'array_int16_3': {'params': {'max_capacity': 7}},
                                                                       'array_int32_3': {'params': {'max_capacity': 7}},
                                                                       'array_int64_3': {'params': {'max_capacity': 7}},
                                                                       'array_double_3': {'params': {'max_capacity': 7}},
                                                                       'array_float_3': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_3': {'params': {'max_capacity': 7}},
                                                                       'array_bool_3': {'params': {'max_capacity': 7}}},
                                                    'dataset_name': 'local',
                                                    'dataset_size': 100000,
                                                    'ni_per': 100},
                                 'collection_params': {'other_fields': ['binary_vector_1',
                                                                        'float_vector_2',
                                                                        'binary_vector_3',
                                                                        'int8_1',
                                                                        'int16_1',
                                                                        'int32_1',
                                                                        'int64_1',
                                                                        'double_1',
                                                                        'float_1',
                                                                        'varchar_1',
                                                                        'bool_1',
                                                                        'json_1',
                                                                        'array_int8_1',
                                                                        'array_int16_1',
                                                                        'array_int32_1',
                                                                        'array_int64_1',
                                                                        'array_double_1',
                                                                        'array_float_1',
                                                                        'array_varchar_1',
                                                                        'array_bool_1',
                                                                        'int8_2',
                                                                        'int16_2',
                                                                        'int32_2',
                                                                        'int64_2',
                                                                        'double_2',
                                                                        'float_2',
                                                                        'varchar_2',
                                                                        'bool_2',
                                                                        'json_2',
                                                                        'array_int8_2',
                                                                        'array_int16_2',
                                                                        'array_int32_2',
                                                                        'array_int64_2',
                                                                        'array_double_2',
                                                                        'array_float_2',
                                                                        'array_varchar_2',
                                                                        'array_bool_2',
                                                                        'int8_3',
                                                                        'int16_3',
                                                                        'int32_3',
                                                                        'int64_3',
                                                                        'double_3',
                                                                        'float_3',
                                                                        'varchar_3',
                                                                        'bool_3',
                                                                        'json_3',
                                                                        'array_int8_3',
                                                                        'array_int16_3',
                                                                        'array_int32_3',
                                                                        'array_int64_3',
                                                                        'array_double_3',
                                                                        'array_float_3',
                                                                        'array_varchar_3',
                                                                        'array_bool_3',
                                                                        'varchar_tail_1',
                                                                        'varchar_tail_2',
                                                                        'varchar_tail_3',
                                                                        'varchar_tail_4',
                                                                        'varchar_tail_5',
                                                                        'varchar_tail_6',
                                                                        'varchar_tail_7',
                                                                        'varchar_tail_8'],
                                                       'shards_num': 1},
                                 'load_params': {'replica_number': 3},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False,
                                                          'reset_db': False},
                                 'index_params': {'index_type': 'HNSW',
                                                  'index_param': {'M': 8,
                                                                  'efConstruction': 200}},
                                 'concurrent_params': {'concurrent_number': 20,
                                                       'during_time': '5h',
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 1,
                                                                  'reqs': [{'search_param': {'ef': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': '10000 '
                                                                                    '< '
                                                                                    'id '
                                                                                    '< '
                                                                                    '90000',
                                                                            'top_k': 10},
                                                                           {'search_param': {'nprobe': 64},
                                                                            'anns_field': 'binary_vector_1',
                                                                            'expr': 'int64_1 '
                                                                                    '> '
                                                                                    '10000',
                                                                            'top_k': 50},
                                                                           {'search_param': {'search_list': 1000},
                                                                            'anns_field': 'float_vector_2',
                                                                            'expr': 'array_length(array_int8_2) '
                                                                                    '== '
                                                                                    '7',
                                                                            'top_k': 1000},
                                                                           {'search_param': {'nprobe': 16},
                                                                            'anns_field': 'binary_vector_3',
                                                                            'expr': 'bool_3 '
                                                                                    '== '
                                                                                    'True',
                                                                            'top_k': 3000}],
                                                                  'rerank': {'WeightedRanker': [0.9,
                                                                                                0.9,
                                                                                                0.5,
                                                                                                0.5]},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 120,
                                                                  'random_data': True}}]},
            'run_id': 2024022603559836,
            'datetime': '2024-02-26 06:52:35.676013',
            'client_version': '2.4.0'},
 'result': {'test_result': {'index': {'RT': 484.1296,
                                      'binary_vector_1': {'RT': 327.7243},
                                      'float_vector_2': {'RT': 257.0689},
                                      'binary_vector_3': {'RT': 0.5384},
                                      'int64_1': {'RT': 0.5416},
                                      'id': {'RT': 0.5404},
                                      'bool_3': {'RT': 0.5387}},
                            'insert': {'total_time': 1657.4622,
                                       'VPS': 60.3332,
                                       'batch_time': 1.6575,
                                       'batch': 100},
                            'flush': {'RT': 2.7263},
                            'load': {'RT': 45.362},
                            'Locust': {'Aggregated': {'Requests': 14348,
                                                      'Fails': 5,
                                                      'RPS': 0.8,
                                                      'fail_s': 0.0,
                                                      'RT_max': 120217.14,
                                                      'RT_avg': 24905.74,
                                                      'TP50': 21000.0,
                                                      'TP99': 79000.0},
                                       'hybrid_search': {'Requests': 14348,
                                                         'Fails': 5,
                                                         'RPS': 0.8,
                                                         'fail_s': 0.0,
                                                         'RT_max': 120217.14,
                                                         'RT_avg': 24905.74,
                                                         'TP50': 21000.0,
                                                         'TP99': 79000.0}}}}}

wangting0128 avatar Feb 27 '24 07:02 wangting0128

The vector dimension seemed to be too large and there are too many scalar fields. So there will be too many small segments and the vector dim is too large. So I guess even if you run single field search it is actually slow

xiaofan-luan avatar Feb 27 '24 15:02 xiaofan-luan

Does 3 replicas all have traffic?

xiaofan-luan avatar Feb 27 '24 15:02 xiaofan-luan

Does 3 replicas all have traffic?

From the monitoring point of view, all three replicas have traffic, and the traffic is relatively balanced. 截屏2024-02-28 14 20 49

wangting0128 avatar Feb 28 '24 06:02 wangting0128

The vector dimension seemed to be too large and there are too many scalar fields. So there will be too many small segments and the vector dim is too large. So I guess even if you run single field search it is actually slow

yes, this scenario is to verify the maximum value check of field number and dim, which will generate many small segments.

image image image

The problem now is that the request latency of segcore is very low, but the final request latency of hybrid_search is very high, so need to look at where most of the time is spent.

截屏2024-02-28 14 35 39 截屏2024-02-28 14 36 13 截屏2024-02-28 14 36 31

wangting0128 avatar Feb 28 '24 06:02 wangting0128

The vector dimension seemed to be too large and there are too many scalar fields. So there will be too many small segments and the vector dim is too large. So I guess even if you run single field search it is actually slow

Under 100k data, 30 segments are generated. image

wangting0128 avatar Feb 28 '24 06:02 wangting0128

do this test in my local machine, can only load replica = 1 because of memory limit, and from this log, single request cost: image all single request means a hybrid search cost: image it shows not very big gap between segcore and final request. perf querynode when search, it shows the main cost in knowhere search image

zhagnlu avatar Mar 08 '24 09:03 zhagnlu

also do a same test in argo and replica = 3 as upper test case, https://argo-workflows.zilliz.cc/workflows/qa/fouramf-multi-vector-5c6mh?tab=workflow segcore search cost: image it shows almost near 4 seconds proxy lantecy: image it shows also about 4 seconds. so no a big gap between segcore and final request.

zhagnlu avatar Mar 08 '24 09:03 zhagnlu

compare the low cpu usage case and my normal test: low cpu usage case: image my normal case: image

the requery cost has big gap, it means the low cpu case cost a lot when requery of hybrid search process. maybe requery use chunk manager to retrieve output field, so it can explain why cpu usage is low because of IO lantency.

so need to check whether use chunk manager for the low cpu case

zhagnlu avatar Mar 08 '24 10:03 zhagnlu

compare the low cpu usage case and my normal test: low cpu usage case: image my normal case: image

the requery cost has big gap, it means the low cpu case cost a lot when requery of hybrid search process. maybe requery use chunk manager to retrieve output field, so it can explain why cpu usage is low because of IO lantency.

so need to check whether use chunk manager for the low cpu case

what field are we specified as output? This does make sense if we want to retrieve from chunkmanager and the topk and nq is huge, and we definitely has some room to improve.

Questions:

  1. how long does it take for requery? do we have any kinds of metrics?
  2. do we do segment pruning while requery?
  3. what is the network bandwitdh of current test? could this be a bandwidth limitatoon?

xiaofan-luan avatar Mar 08 '24 23:03 xiaofan-luan

Different scenarios, same problem

argo task:fouramf-multi-vector-tdclm test case name:test_hybrid_search_locust_shard1_float_dql_diskann_standalone

server:

NAME                                                              READY   STATUS        RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-multi-vr-tdclm-34-3023-etcd-0                             1/1     Running       0                22h     10.104.29.90    4am-node35   <none>           <none>
fouramf-multi-vr-tdclm-34-3023-milvus-standalone-6ffdb7d85ddjdr   1/1     Running       1 (22h ago)      22h     10.104.21.82    4am-node24   <none>           <none>
fouramf-multi-vr-tdclm-34-3023-minio-6766f66649-snfxk             1/1     Running       0                22h     10.104.34.84    4am-node37   <none>           <none>

image image From the monitoring point of view, the search tasks of milvus have been queued, the latency of segcore has become higher, but the CPU usage is not enough. image

client pod name: fouramf-multi-vector-tdclm-1156163776 client log: client.log 截屏2024-03-21 10 57 00

test steps:

       concurrent test and calculation of RT and QPS

        :purpose:  `shard_num=1, float_vector DQL`
            verify concurrent DQL scenario which has 4 float_vector fields(DISKANN) and 60 scalar fields

        :test steps:
            1. create collection with fields:
                'float_vector': 2048dim,
                'float_vector_1': 2048dim,
                'float_vector_2': 2048dim,
                'float_vector_3': 2048dim,
                all scalar fields: varchar max_length=10, array max_capacity=7
            2. build indexes:
                DISKANN: 'float_vector', 'float_vector_1', 'float_vector_2', 'float_vector_3'
                default_scalar_index: 'int64_1'
                INVERTED: 'id', 'bool_3'
            3. insert 100k data
            4. flush collection
            5. build indexes again using the same params
            6. load collection
                replica: 1
            7. concurrent request:
                - hybrid_search

test result:

[2024-03-20 22:36:17,825 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-03-20 22:36:17,826 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-03-20 22:36:17,826 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-03-20 22:36:17,826 -  INFO - fouram]: grpc     hybrid_search                                                                   1886     1(0.05%) |  37864   20599   60002  37000 |    0.52        0.00 (stats.py:789)
[2024-03-20 22:36:17,826 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-03-20 22:36:17,826 -  INFO - fouram]:          Aggregated                                                                      1886     1(0.05%) |  37864   20599   60002  37000 |    0.52        0.00 (stats.py:789)
[2024-03-20 22:36:17,826 -  INFO - fouram]:  (stats.py:790)
[2024-03-20 22:36:17,829 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'standalone',
            'config_name': 'standalone_16c64m',
            'config': {'standalone': {'resources': {'limits': {'cpu': '16.0',
                                                               'memory': '64Gi'},
                                                    'requests': {'cpu': '9.0',
                                                                 'memory': '33Gi'}}},
                       'cluster': {'enabled': False},
                       'etcd': {'replicaCount': 1,
                                'metrics': {'enabled': True,
                                            'podMonitor': {'enabled': True}}},
                       'minio': {'mode': 'standalone',
                                 'metrics': {'podMonitor': {'enabled': True}}},
                       'pulsar': {'enabled': False},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus',
                                         'tag': '2.4-20240320-19d649f0-amd64'}}},
            'host': 'fouramf-multi-vr-tdclm-34-3023-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_hybrid_search_locust_shard1_float_dql_diskann_standalone',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 2048,
                                                    'max_length': 10,
                                                    'scalars_index': {'int64_1': {},
                                                                      'id': {'index_type': 'INVERTED'},
                                                                      'bool_3': {'index_type': 'INVERTED'}},
                                                    'vectors_index': {'float_vector_1': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'},
                                                                      'float_vector_2': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'},
                                                                      'float_vector_3': {'index_type': 'DISKANN',
                                                                                         'index_param': {},
                                                                                         'metric_type': 'L2'}},
                                                    'scalars_params': {'array_int8_1': {'params': {'max_capacity': 7}},
                                                                       'array_int16_1': {'params': {'max_capacity': 7}},
                                                                       'array_int32_1': {'params': {'max_capacity': 7}},
                                                                       'array_int64_1': {'params': {'max_capacity': 7}},
                                                                       'array_double_1': {'params': {'max_capacity': 7}},
                                                                       'array_float_1': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_1': {'params': {'max_capacity': 7}},
                                                                       'array_bool_1': {'params': {'max_capacity': 7}},
                                                                       'array_int8_2': {'params': {'max_capacity': 7}},
                                                                       'array_int16_2': {'params': {'max_capacity': 7}},
                                                                       'array_int32_2': {'params': {'max_capacity': 7}},
                                                                       'array_int64_2': {'params': {'max_capacity': 7}},
                                                                       'array_double_2': {'params': {'max_capacity': 7}},
                                                                       'array_float_2': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_2': {'params': {'max_capacity': 7}},
                                                                       'array_bool_2': {'params': {'max_capacity': 7}},
                                                                       'array_int8_3': {'params': {'max_capacity': 7}},
                                                                       'array_int16_3': {'params': {'max_capacity': 7}},
                                                                       'array_int32_3': {'params': {'max_capacity': 7}},
                                                                       'array_int64_3': {'params': {'max_capacity': 7}},
                                                                       'array_double_3': {'params': {'max_capacity': 7}},
                                                                       'array_float_3': {'params': {'max_capacity': 7}},
                                                                       'array_varchar_3': {'params': {'max_capacity': 7}},
                                                                       'array_bool_3': {'params': {'max_capacity': 7}}},
                                                    'dataset_name': 'local',
                                                    'dataset_size': 1500000,
                                                    'ni_per': 100},
                                 'collection_params': {'other_fields': ['float_vector_1',
                                                                        'float_vector_2',
                                                                        'float_vector_3',
                                                                        'int8_1',
                                                                        'int16_1',
                                                                        'int32_1',
                                                                        'int64_1',
                                                                        'double_1',
                                                                        'float_1',
                                                                        'varchar_1',
                                                                        'bool_1',
                                                                        'json_1',
                                                                        'array_int8_1',
                                                                        'array_int16_1',
                                                                        'array_int32_1',
                                                                        'array_int64_1',
                                                                        'array_double_1',
                                                                        'array_float_1',
                                                                        'array_varchar_1',
                                                                        'array_bool_1',
                                                                        'int8_2',
                                                                        'int16_2',
                                                                        'int32_2',
                                                                        'int64_2',
                                                                        'double_2',
                                                                        'float_2',
                                                                        'varchar_2',
                                                                        'bool_2',
                                                                        'json_2',
                                                                        'array_int8_2',
                                                                        'array_int16_2',
                                                                        'array_int32_2',
                                                                        'array_int64_2',
                                                                        'array_double_2',
                                                                        'array_float_2',
                                                                        'array_varchar_2',
                                                                        'array_bool_2',
                                                                        'int8_3',
                                                                        'int16_3',
                                                                        'int32_3',
                                                                        'int64_3',
                                                                        'double_3',
                                                                        'float_3',
                                                                        'varchar_3',
                                                                        'bool_3',
                                                                        'json_3',
                                                                        'array_int8_3',
                                                                        'array_int16_3',
                                                                        'array_int32_3',
                                                                        'array_int64_3',
                                                                        'array_double_3',
                                                                        'array_float_3',
                                                                        'array_varchar_3',
                                                                        'array_bool_3',
                                                                        'varchar_tail_1',
                                                                        'varchar_tail_2',
                                                                        'varchar_tail_3',
                                                                        'varchar_tail_4',
                                                                        'varchar_tail_5',
                                                                        'varchar_tail_6',
                                                                        'varchar_tail_7',
                                                                        'varchar_tail_8'],
                                                       'shards_num': 1},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False,
                                                          'reset_db': False},
                                 'index_params': {'index_type': 'DISKANN',
                                                  'index_param': {}},
                                 'concurrent_params': {'concurrent_number': 20,
                                                       'during_time': '1h',
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 100,
                                                                  'reqs': [{'search_param': {'search_list': 30},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': 'id '
                                                                                    '> '
                                                                                    '150000',
                                                                            'top_k': 10},
                                                                           {'search_param': {'search_list': 100},
                                                                            'anns_field': 'float_vector_1',
                                                                            'expr': 'int64_1 '
                                                                                    '<= '
                                                                                    '1350000',
                                                                            'top_k': 50},
                                                                           {'search_param': {'search_list': 1500},
                                                                            'anns_field': 'float_vector_2',
                                                                            'expr': 'array_length(array_int8_2) '
                                                                                    '== '
                                                                                    '7',
                                                                            'top_k': 1000},
                                                                           {'search_param': {'search_list': 20000},
                                                                            'anns_field': 'float_vector_3',
                                                                            'expr': 'bool_3 '
                                                                                    '== '
                                                                                    'True',
                                                                            'top_k': 16384}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['float_vector'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'random_data': True}}]},
            'run_id': 2024032084617105,
            'datetime': '2024-03-20 04:21:01.809449',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 14677.199,
                                      'float_vector_1': {'RT': 16331.9596},
                                      'float_vector_2': {'RT': 19644.1707},
                                      'float_vector_3': {'RT': 644.6092},
                                      'int64_1': {'RT': 1.0166},
                                      'id': {'RT': 1.0174},
                                      'bool_3': {'RT': 1.016}},
                            'insert': {'total_time': 4951.575,
                                       'VPS': 302.9339,
                                       'batch_time': 0.3301,
                                       'batch': 100},
                            'flush': {'RT': 3.5238},
                            'load': {'RT': 173.0584},
                            'Locust': {'Aggregated': {'Requests': 1886,
                                                      'Fails': 1,
                                                      'RPS': 0.52,
                                                      'fail_s': 0.0,
                                                      'RT_max': 60002.48,
                                                      'RT_avg': 37864.57,
                                                      'TP50': 37000.0,
                                                      'TP99': 51000.0},
                                       'hybrid_search': {'Requests': 1886,
                                                         'Fails': 1,
                                                         'RPS': 0.52,
                                                         'fail_s': 0.0,
                                                         'RT_max': 60002.48,
                                                         'RT_avg': 37864.57,
                                                         'TP50': 37000.0,
                                                         'TP99': 51000.0}}}}}

@zhagnlu

wangting0128 avatar Mar 21 '24 03:03 wangting0128

this issue has two problems, low cpu usage and high latency. actually, it has two different situations considering querynode cpu pressure

first one: both low cpu usage for concurrency = 1 and concurrency=20 this situation usually happens querynode has more cpu resource than needed, although concurrency=20. querynode has no task in queue, task can be done quickly, and cpu usage is variable, usually, more cpu usage more concurrency, but not not full usage just as show: image under same concurrency, cpu usage may different for different querynode, cpu usage is related to request that sended to this querynode, more request, more cpu usage, as two picture shows below: image image

second one: high search latency for concurrency =20 this situation usually happens when querynode has less cpu usage than needed. querynode can not process tasks in time. as figure below shows, 16cores, and almost 16 tasks concurrency. every task has segments size subtasks. image if collections has many segments, because execute pool only have corenum * 2 threads. segment task need to wait available threads. so segment latency is multi bigger than segcore latency. as figure below shows: image As result, the E2E latency may become very bigger although segcore latency differ a little.

zhagnlu avatar Mar 29 '24 04:03 zhagnlu

if it is a 16core machine, none of the pod reach usage more than 60%, does that make sense? I would expect at least one mode has more than 80% cpu? could it be due to other. bottlenecks like lock?

xiaofan-luan avatar Mar 29 '24 04:03 xiaofan-luan

if it is a 16core machine, none of the pod reach usage more than 60%, does that make sense? I would expect at least one mode has more than 80% cpu? could it be due to other. bottlenecks like lock?

if client not give enough pressure, it is nomal.
If client pressure is enough, I test this case again, and can not reproduce this low cpu usage, it is almost 16 cores. https://grafana-4am.zilliz.cc/d/uLf5cJ3Ga/milvus2-0?orgId=1&var-datasource=P1809F7CD0C75ACF3&var-cluster=&var-namespace=qa-milvus&var-instance=fouramf-multi-vr-8xpng-100-4981&var-collection=All&var-app_name=milvus&from=1711587644382&to=1711677452270 image

for this less than 80% situations, I guess maybe diskann problem, other cases indicate diskann index may blocked by IO, memory and algorithm argument. and need manual modify to get a high cpu usage

zhagnlu avatar Apr 01 '24 02:04 zhagnlu

@wangting0128 can you confirm this a diskann test? diskann bottle neck is not in cpu so I guess the low cpu usage is reasonabel

xiaofan-luan avatar Apr 01 '24 04:04 xiaofan-luan

@wangting0128 can you confirm this a diskann test? diskann bottle neck is not in cpu so I guess the low cpu usage is reasonabel

yes. This is a unique problem of diskann. It has been verified on the HNSW index that it can fill up the CPU.

I've confirmed with colleagues at knowhere that the IOPS in the diskann scenario is very high, so the bottleneck is not in the disk IOPS. The disk speed is too fast, causing the CPU usage to not increase. image

wangting0128 avatar Apr 01 '24 06:04 wangting0128

The IO performance of the test machine is 23x better than AWS's 1cu. And default global search thread pool size of knowhere is (4 * cpu). If IOPS is too low, the overhead of context switching of four threads on one core is negligible and can improve IO concurrency. If IOPS is too high, IO requests return quickly, and the thread context switching overhead cannot be ignored.

cqy123456 avatar Apr 01 '24 08:04 cqy123456

The IO performance of the test machine is 23x better than AWS's 1cu. And default global search thread pool size of knowhere is (4 * cpu). If IOPS is too low, the overhead of context switching of four threads on one core is negligible and can improve IO concurrency. If IOPS is too high, IO requests return quickly, and the thread context switching overhead cannot be ignored.

That's why I guess aio or io_uring still gonna to be important. Because it's always hard to manage thread numbers in BIO mode

xiaofan-luan avatar Apr 01 '24 16:04 xiaofan-luan

@wangting0128 I saw the field float_vector in the search request but not in the vectors_index in both of the "test results" sections, is it intended?

PwzXxm avatar Apr 16 '24 11:04 PwzXxm

@wangting0128 I saw the field float_vector in the search request but not in the vectors_index in both of the "test results" sections, is it intended?

Are you asking about the index time is created on the float_vector filed?

float_vector field: 'index': {'RT': 14677.199,

wangting0128 avatar Apr 16 '24 11:04 wangting0128

@wangting0128 I saw the field float_vector in the search request but not in the vectors_index in both of the "test results" sections, is it intended?

Are you asking about the index time is created on the float_vector filed?

float_vector field: 'index': {'RT': 14677.199,

Not really. I mean why there is no index on the float_vector field image

But has search requests on this field, will it be just brute-force search or it has default index type? image

PwzXxm avatar Apr 16 '24 11:04 PwzXxm

@wangting0128 I saw the field float_vector in the search request but not in the vectors_index in both of the "test results" sections, is it intended?

Are you asking about the index time is created on the float_vector filed? float_vector field: 'index': {'RT': 14677.199,

Not really. I mean why there is no index on the float_vector field image

But has search requests on this field, will it be just brute-force search or it has default index type? image

float_vector index is as follow:

'index_params': {'index_type': 'DISKANN',
                                                  'index_param': {}},

wangting0128 avatar Apr 16 '24 12:04 wangting0128

@wangting0128 can you confirm this a diskann test? diskann bottle neck is not in cpu so I guess the low cpu usage is reasonabel

yes. This is a unique problem of diskann. It has been verified on the HNSW index that it can fill up the CPU.

I've confirmed with colleagues at knowhere that the IOPS in the diskann scenario is very high, so the bottleneck is not in the disk IOPS. The disk speed is too fast, causing the CPU usage to not increase. image

Looking at the date&time of CPU Usage at the top and the Pod IO at the bottom of the dashboard. The IOPS barely changes whenever the concurrency. Guessing it already reached the max IOPS at that time on that instance, even when the concurrent number is 1 since each task is heavy. The I/O queued up causes the Search Segment Latency to be multi-10 seconds and around 40 segments are causing Proxy Search Latency to be around 1min.

PwzXxm avatar Apr 19 '24 06:04 PwzXxm

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] avatar Jun 10 '24 06:06 stale[bot]