milvus icon indicating copy to clipboard operation
milvus copied to clipboard

[Bug]: metric type JACCARD is allowed for float vector FLAT index

Open congqixia opened this issue 1 year ago • 1 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version:113f9a0
- Deployment mode(standalone or cluster): both
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Create FLAT index on float vector field with JACCARD metric type will not return error

Expected Behavior

JACCARD metric type shall only be allowed on binary vector

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

congqixia avatar May 12 '23 07:05 congqixia

/assign @cydrain /unassign

yanliang567 avatar May 12 '23 10:05 yanliang567

I see both pymilvus and Milvus will report error on this scenario:

pymilvus

  File "/home/caiyd/vec/pymilvus/pymilvus/client/grpc_handler.py", line 572, in create_index
    check_index_params(params)
  File "/home/caiyd/vec/pymilvus/pymilvus/client/check.py", line 391, in check_index_params
    raise ParamError(message=f"Invalid metric_type: {params['metric_type']}, which does not match the index type: {params['index_type']}")
pymilvus.exceptions.ParamError: <ParamError: (code=1, message=Invalid metric_type: JACCARD, which does not match the index type: FLAT)>

Milvus

pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=fail to search on all shard leaders, err=fail to Search, QueryNode ID=3, reason=worker(3) query failed: [UnexpectedError] Assert "is_float_data_type == is_float_metric_type" at /home/caiyd/vec/milvus/internal/core/src/query/SearchBruteForce.cpp:34
 => [BruteForceSearch] Data type and metric type miss-match)>

cydrain avatar May 15 '23 07:05 cydrain

@congqixia I need your script to re-produce this issue

cydrain avatar May 15 '23 07:05 cydrain

@cydrain this problem is discovered in this e2e test in go sdk: https://github.com/milvus-io/milvus-sdk-go/blob/ce8e641c0b5e856f34fa2d8a8b6865789c84e075/test/testcases/index_test.go#L235-L241 You could uncomment these lines and run this command in test/testcases folder

go test -run "TestCreateIndexInvalidParams" -v --tags L0

congqixia avatar May 15 '23 07:05 congqixia

similar check has been made before searching https://github.com/milvus-io/milvus/pull/20838

cydrain avatar May 15 '23 09:05 cydrain

For real index (such as: HNSW), we also have checked the param validation before creating index

func (c *binaryVectorBaseChecker) CheckTrain(params map[string]string) error
func (c *floatVectorBaseChecker) CheckTrain(params map[string]string) error 

we will see error like this:

pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=metric type not found or not supported, supported: [L2 IP COSINE])>

cydrain avatar May 15 '23 09:05 cydrain

As a conclusion, we have already checked the param validation:

  1. before creating real index
  2. before searching

For this case, creating "FLAT" index will not create a real index, so it bypasses above checking. Suggest to update the test case to create "HNSW" instead.

cydrain avatar May 15 '23 09:05 cydrain

@cydrain I understand the code implements, but I still suggest not skip the checking for FLAT index, which benefits:

  1. avoid users confusion
  2. save users time
  3. milvus behaves the same for all the index types

yanliang567 avatar May 15 '23 11:05 yanliang567