milvus
[Bug]: Search results become less and less if keep deleting the search results
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: 2.2.0-20230116-7e2121e6
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
Current Behavior
I have one collection with 13+ million vectors, and I want to delete 2.6 million of them. So I:
- search with a random vector with top=10000
- get the 10000 ids from the search results
- delete the entities with those ids
- repeat the search and delete 260 times
The search results become fewer and fewer, and drop to 0 after several rounds of deleting.
Expected Behavior
Search always returns top=10000.
Reproduce code:
for r in range(rounds):
    search_vector = [[random.random() for _ in range(dim)] for _ in range(1)]
    results = c.search(data=search_vector, anns_field=vector_field_name,
                       param=search_params, limit=nb)
    for hits in results:
        ids = hits.ids
        c.delete(expr=f"{primary_field_name} in {ids}")
        logging.info(f"deleted {len(ids)} entities")
Steps To Reproduce
check reproduce code above
Milvus Log
check the logs around 01/17/2023 08:01:40 AM
yanliang-cluster-1cu-milvus-datanode-69f6f588c5-g4x2h 1/1 Running 0 43h
yanliang-cluster-1cu-milvus-indexnode-6d6554f9fc-dbgqj 1/1 Running 0 43h
yanliang-cluster-1cu-milvus-mixcoord-5c45bf7fc-p5hpq 1/1 Running 0 43h
yanliang-cluster-1cu-milvus-proxy-7bc7bfc6c4-8s7t9 1/1 Running 0 43h
yanliang-cluster-1cu-milvus-querynode-5784dcf45d-d69nb 1/1 Running 0 43h
yanliang-cluster-1cu-milvus-querynode-5784dcf45d-w86g6 1/1 Running 0 43h
Anything else?
No response
/assign @liliu-z @cydrain /unassign
@yanliang567 what's the index type and search param ?
index params: "index_type": "HNSW", "metric_type": "IP", "params": {"M": 8, "efConstruction": 96}
search_params = {"metric_type": "IP", "params": {"ef": 10000}}
Tested knowhere with the sift1M dataset: until the last iteration (when all data has been removed), knowhere always returns 10000 valid results. (bt means bitset count)
Same for the glove-200 dataset.
After changing the metric type to "L2", the script runs as expected.
With normalization and the IP metric type, the following script can always delete 10000 entities: 21785_create_n_insert_normalize.py.txt
So this is not a real issue: with the IP metric type, vectors MUST be normalized first. @yanliang567
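For reference, the normalization step mentioned above could look roughly like the sketch below. It is not taken from the attached script; the helper name `l2_normalize` and the dimension are assumptions made for illustration, and the same scaling would apply to both inserted vectors and query vectors.

```python
import math
import random


def l2_normalize(vec):
    """Return vec scaled to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


dim = 8  # hypothetical dimension, chosen only for this example
raw = [[random.random() for _ in range(dim)] for _ in range(3)]

# Normalize every vector before inserting it (and every query vector before searching).
normalized = [l2_normalize(v) for v in raw]
```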
/assign @yanliang567
Why does IP have to be normalized? Normalized IP would be cosine, by the way.
@liliu-z
We only support IP for now; Cosine is not supported yet. IP without normalization is workable but gives very low recall (it doesn't make sense mathematically). IP + normalization = Cosine, but we don't support Cosine directly for now. This is why we recommend that users normalize before using IP.
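The "IP + normalization = Cosine" statement can be checked on a small example. This is a minimal sketch in plain Python with made-up vectors; the helper names are illustrative only and nothing here calls Milvus.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """L2 norm of a vector."""
    return math.sqrt(dot(a, a))


def normalize(a):
    """Scale a non-zero vector to unit L2 norm."""
    n = norm(a)
    return [x / n for x in a]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    return dot(a, b) / (norm(a) * norm(b))


a = [3.0, 4.0]
b = [1.0, 0.0]

ip_raw = dot(a, b)                          # depends on the vector lengths
ip_unit = dot(normalize(a), normalize(b))   # IP after normalization
cos_ab = cosine(a, b)                       # cosine on the raw vectors

# ip_unit and cos_ab agree up to floating-point rounding; ip_raw does not.
```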
Getting back to this issue: we still need to investigate what happens, since the behavior is not as expected.
@cydrain Can you help do a further check on why this happens? Appreciate it!
ok
This issue only exists for HNSW, not for IVF_FLAT or IVF_SQ8
/assign @hhy3
This happens because IP is not a true distance metric, so when an HNSW graph is built with IP, the graph is not fully connected. Starting from the entry point (ep), the search can only reach nearby points.
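A toy illustration of this effect, under strong simplifying assumptions: the sketch below builds a plain directed k-nearest-neighbor graph (not HNSW itself, which has layers and bidirectional links with pruning) over contrived points that share a direction but differ in length, then counts what a traversal from one entry point can reach. All names and data are hypothetical.

```python
from collections import deque


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def neg_sq_l2(a, b):
    """Negative squared Euclidean distance, so that larger means closer."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(points, score, k):
    """Directed graph: each node links to its k highest-scoring other nodes."""
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: score(p, points[j]), reverse=True)
        graph[i] = others[:k]
    return graph


def reachable(graph, entry):
    """Breadth-first traversal: the set of nodes reachable from entry."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Ten points on one ray, with norms growing from point 0 to point 9.
points = [(float(i), float(i)) for i in range(1, 11)]

ip_reach = reachable(knn_graph(points, inner_product, k=2), entry=0)
l2_reach = reachable(knn_graph(points, neg_sq_l2, k=2), entry=0)

# With IP every node links only to the longest vectors, so most nodes are unreachable.
print(sorted(ip_reach), len(l2_reach))
```

In this contrived setup the IP graph reaches only the entry point and the three longest vectors, while the L2 graph reaches all ten points. Real datasets are less extreme, which matches the later remark that connectivity depends on the dataset.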
So all graph indexes built with IP should have a similar problem? Building the graph with pre-clustering might help.
To my understanding, when we use IP in HNSW, the connectivity of the graph depends on the dataset. We can try whether pre-clustering mitigates this, but in theory IP + graph is not a combination that makes sense. We will work on the Cosine metric type very soon.
For a graph-based index (such as HNSW), the distance must obey this rule:
if A is close to B, and B is close to C ==> A is close to C (closeness is transitive)
IP without normalization does not obey this rule, which leaves the HNSW graph not fully connected.
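One way to see that unnormalized IP does not behave like a distance is a small counterexample. The sketch below uses made-up 2-D vectors and treats `-IP` as a stand-in "distance" purely for illustration; it checks two properties a metric would satisfy (a point is its own closest match, and the triangle inequality) before and after normalization.

```python
import math


def ip(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(a):
    """Scale a non-zero vector to unit L2 norm."""
    n = math.sqrt(ip(a, a))
    return tuple(x / n for x in a)


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neg_ip(a, b):
    """Hypothetical 'distance' obtained by negating the inner product."""
    return -ip(a, b)


a, b, c = (1.0, 0.0), (10.0, 10.0), (0.0, 1.0)

# 1) Under raw IP, a is "closer" to the long vector b than to itself.
self_is_best_ip = ip(a, a) >= ip(a, b)

# 2) With -IP as the distance, the triangle inequality fails for a, b, c.
triangle_ip = neg_ip(a, c) <= neg_ip(a, b) + neg_ip(b, c)

# 3) After normalization, both properties hold (IP for self-match, L2 for the triangle).
ua, ub, uc = unit(a), unit(b), unit(c)
self_is_best_unit = ip(ua, ua) >= ip(ua, ub)
triangle_l2 = l2(ua, uc) <= l2(ua, ub) + l2(ub, uc)
```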
@yanliang567 Milvus 2.3 supports COSINE now. Can you retest this case with the COSINE metric type? I also suggest changing the Milestone to 2.3.
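For the retest, the index and search parameters from earlier in this thread would presumably only need their metric type swapped. The sketch below reuses the HNSW values quoted above. The commented-out calls assume `c` is the pymilvus collection from the reproduce code and are not verified against the actual retest script.

```python
# Same HNSW settings as reported above, with only the metric type changed.
index_params = {
    "index_type": "HNSW",
    "metric_type": "COSINE",
    "params": {"M": 8, "efConstruction": 96},
}
search_params = {"metric_type": "COSINE", "params": {"ef": 10000}}

# Assumed usage with a pymilvus Collection `c` (not executed here):
# c.create_index(field_name=vector_field_name, index_params=index_params)
# results = c.search(data=search_vector, anns_field=vector_field_name,
#                    param=search_params, limit=nb)
```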
OK, will do
not reproduced on master-20230619-a6310050 with cosine