[Bug]: [Nightly]Binary search returned 0 result with tanimoto flat index

Open NicoYuan1986 opened this issue 1 year ago • 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: 081572d
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): kafka
- SDK version(e.g. pymilvus v2.0.0rc2): 2.3.0.dev48
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Search on a binary vector collection returned 0 results with the TANIMOTO metric on a flat index.

[2023-03-26T19:36:41.865Z] [2023-03-26 18:26:13 - ERROR - ci_test]: search_results_check: limit(topK) searched (0) is not equal with expected (2) (func_check.py:253)

Expected Behavior

The test cases pass: each search returns the expected 2 results.

Steps To Reproduce

No response

Milvus Log

  1. link: https://jenkins.milvus.io:18080/blue/organizations/jenkins/Milvus%20Nightly%20CI/detail/master/323/pipeline/155/
  2. log: artifacts-milvus-distributed-kafka-nightly-323-pymilvus-e2e-logs.tar.gz
  3. collection name: search_collection_7ElTx1fX
  4. all failed cases:
test_search_binary_tanimoto_flat_index
test_search_binary_without_flush[TANIMOTO-True]

Anything else?

No response

NicoYuan1986 avatar Mar 27 '23 02:03 NicoYuan1986

case:

[2023-03-26T19:36:41.863Z] _ TestCollectionSearch.test_search_binary_tanimoto_flat_index[2-32-False-False-True-BIN_IVF_FLAT] _
[2023-03-26T19:36:41.863Z] [gw5] linux -- Python 3.8.16 /usr/local/bin/python3
[2023-03-26T19:36:41.863Z] 
[2023-03-26T19:36:41.863Z] self = <test_search.TestCollectionSearch object at 0x7fc45c69dd90>, nq = 2
[2023-03-26T19:36:41.863Z] dim = 32, auto_id = False, _async = False, index = 'BIN_IVF_FLAT'
[2023-03-26T19:36:41.863Z] is_flush = True
[2023-03-26T19:36:41.863Z] 
[2023-03-26T19:36:41.863Z]     @pytest.mark.tags(CaseLabel.L2)
[2023-03-26T19:36:41.863Z]     @pytest.mark.parametrize("index", ["BIN_FLAT", "BIN_IVF_FLAT"])
[2023-03-26T19:36:41.863Z]     def test_search_binary_tanimoto_flat_index(self, nq, dim, auto_id, _async, index, is_flush):
[2023-03-26T19:36:41.863Z]         """
[2023-03-26T19:36:41.863Z]         target: search binary_collection, and check the result: distance
[2023-03-26T19:36:41.863Z]         method: compare the return distance value with value computed with TANIMOTO
[2023-03-26T19:36:41.863Z]         expected: the return distance equals to the computed value
[2023-03-26T19:36:41.863Z]         """
[2023-03-26T19:36:41.863Z]         # 1. initialize with binary data
[2023-03-26T19:36:41.863Z]         collection_w, _, binary_raw_vector, insert_ids = self.init_collection_general(prefix, True, 2,
[2023-03-26T19:36:41.863Z]                                                                                       is_binary=True,
[2023-03-26T19:36:41.863Z]                                                                                       auto_id=auto_id,
[2023-03-26T19:36:41.863Z]                                                                                       dim=dim,
[2023-03-26T19:36:41.863Z]                                                                                       is_index=False,
[2023-03-26T19:36:41.863Z]                                                                                       is_flush=is_flush)[0:4]
[2023-03-26T19:36:41.863Z]         log.info("auto_id= %s, _async= %s" % (auto_id, _async))
[2023-03-26T19:36:41.863Z]         # 2. create index
[2023-03-26T19:36:41.863Z]         default_index = {"index_type": index, "params": {"nlist": 128}, "metric_type": "TANIMOTO"}
[2023-03-26T19:36:41.863Z]         collection_w.create_index("binary_vector", default_index)
[2023-03-26T19:36:41.863Z]         collection_w.load()
[2023-03-26T19:36:41.863Z]         # 3. compute the distance
[2023-03-26T19:36:41.863Z]         query_raw_vector, binary_vectors = cf.gen_binary_vectors(3000, dim)
[2023-03-26T19:36:41.863Z]         distance_0 = cf.tanimoto(query_raw_vector[0], binary_raw_vector[0])
[2023-03-26T19:36:41.863Z]         distance_1 = cf.tanimoto(query_raw_vector[0], binary_raw_vector[1])
[2023-03-26T19:36:41.863Z]         # 4. search and compare the distance
[2023-03-26T19:36:41.863Z]         search_params = {"metric_type": "TANIMOTO", "params": {"nprobe": 10}}
[2023-03-26T19:36:41.863Z] >       res = collection_w.search(binary_vectors[:nq], "binary_vector",
[2023-03-26T19:36:41.863Z]                                   search_params, default_limit, "int64 >= 0",
[2023-03-26T19:36:41.863Z]                                   _async=_async,
[2023-03-26T19:36:41.863Z]                                   check_task=CheckTasks.check_search_results,
[2023-03-26T19:36:41.863Z]                                   check_items={"nq": nq,
[2023-03-26T19:36:41.863Z]                                                "ids": insert_ids,
[2023-03-26T19:36:41.863Z]                                                "limit": 2,
[2023-03-26T19:36:41.863Z]                                                "_async": _async})[0]
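
For readers unfamiliar with step 3 of the case above, here is a minimal sketch of the expected-distance computation. It assumes that Tanimoto distance is defined as -log2 of the Jaccard similarity, which is my understanding of the Milvus 2.x definition, and that raw 0/1 vectors are packed MSB-first into bytes before searching. `pack_bits` and `tanimoto_distance` are hypothetical names for illustration only; they are not the `cf.gen_binary_vectors` or `cf.tanimoto` helpers from the test suite, whose exact behavior may differ.

```python
import math


def pack_bits(bits):
    """Pack a list of 0/1 ints into bytes, most significant bit first.

    Assumption: the vector dimension is a multiple of 8, as required
    for binary vectors.
    """
    if len(bits) % 8 != 0:
        raise ValueError("dimension must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | (b & 1)
        out.append(byte)
    return bytes(out)


def tanimoto_distance(x, y):
    """Return -log2(|x AND y| / |x OR y|) for two equal-length 0/1 lists.

    Choices made for this sketch (not taken from the test suite):
    - two all-zero vectors are treated as identical (distance 0.0);
    - vectors with no shared set bit get an infinite distance.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    if either == 0:
        return 0.0
    if both == 0:
        return math.inf
    # log2(either / both) equals -log2(both / either) and avoids -0.0.
    return math.log2(either / both)


if __name__ == "__main__":
    query = [1, 1, 0, 0, 1, 0, 1, 0]
    stored = [1, 0, 0, 0, 1, 1, 1, 0]
    print(pack_bits(query).hex())
    print(tanimoto_distance(query, stored))
```

With `limit` set to 2 and two inserted entities, the check expects two hits per query whose distances match values computed this way; the failure reported here is that the search returned no hits at all, so the distance comparison was never reached.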

NicoYuan1986 avatar Mar 27 '23 02:03 NicoYuan1986

This should be fixed with knowhere-v2.1.1. Related PR: https://github.com/milvus-io/knowhere/pull/783

cydrain avatar Mar 27 '23 10:03 cydrain

Similar to #22403.

cydrain avatar Mar 27 '23 10:03 cydrain

Reproduced twice. Milvus: d55c860, pymilvus: 2.3.0.dev48

  1. link: https://jenkins.milvus.io:18080/blue/organizations/jenkins/Milvus%20Nightly%20CI/detail/master/324/pipeline/187
  2. log: artifacts-milvus-distributed-pulsar-nightly-324-pymilvus-e2e-logs.tar.gz
  3. failed time:
[2023-03-27T18:28:10.970Z] [gw3] [ 45%] FAILED testcases/test_search.py::TestCollectionSearch::test_search_binary_tanimoto_flat_index[500-32-False-False-True-BIN_FLAT]
[2023-03-27T18:29:24.948Z] [gw3] [ 46%] FAILED testcases/test_search.py::TestCollectionSearch::test_search_binary_tanimoto_flat_index[500-32-True-True-True-BIN_IVF_FLAT]

NicoYuan1986 avatar Mar 28 '23 02:03 NicoYuan1986

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] avatar Apr 27 '23 11:04 stale[bot]