[Bug]: [restful v2]when nq > 1, the hybrid search can return all results, but it flattens them

Open zhuwenxing opened this issue 1 year ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version:master/2.4
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

When nq > 1, hybrid search returns all results, but flattens them into a single list:

[1,2,...20]

Expected Behavior

[ [1_1,...1_10], [2_1,...2_10] ]

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

zhuwenxing avatar Apr 18 '24 07:04 zhuwenxing

/assign @PowderLi

zhuwenxing avatar Apr 18 '24 08:04 zhuwenxing

/assign @czs007

PowderLi avatar Apr 24 '24 08:04 PowderLi

How can this be reproduced? @zhuwenxing, could you please provide an example?

PowderLi avatar May 24 '24 14:05 PowderLi

see https://github.com/zhuwenxing/milvus/blob/9f8e89000d8ef42aaac1b5d025059930e8898658/tests/restful_client_v2/testcases/test_vector_operations.py#L1234

zhuwenxing avatar May 24 '24 16:05 zhuwenxing

What does the Python SDK return? When nq > 1, the search API also returns [1,2,...20]. Is that unexpected?

PowderLi avatar May 25 '24 12:05 PowderLi

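Python SDK:

    import time

    # Excerpt: assumes collection, vectors, search_params, topK, and TIMEOUT are defined earlier.
    t0 = time.time()
    res = collection.search(
        vectors[-2:], "float_vector", search_params, topK,
        "int64 > 100", output_fields=["int64", "float"], timeout=TIMEOUT
    )
    t1 = time.time()
    print(f"search cost  {t1 - t0:.4f} seconds")
    # res holds one group of hits per query vector (nq = 2 here), not a flat list
    for hits in res:
        for hit in hits:
            # Read the "float" output field from each hit
            print(hit, hit.entity.get("float"))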

So, in my opinion, the RESTful response should look like the following:

Expected Behavior [ [1_1,...1_10], [2_1,...2_10] ]

zhuwenxing avatar May 27 '24 02:05 zhuwenxing

I think the difference is that the Python SDK can wrap on top of the proto, so it is more efficiency-driven. But shouldn't the RESTful API respond in a way that is easier to understand?

xiaofan-luan avatar May 27 '24 03:05 xiaofan-luan

The RESTful response needs to be split for nq > 1. Otherwise we lose information in a flattened result (consider the case where one query returns fewer hits than the limit).
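
To illustrate the information loss, here is a minimal sketch in plain Python, with no Milvus calls. The hit labels and the `naive_split` helper are hypothetical, made up for this example. It assumes a client that tries to recover per-query groups by cutting the flat list into chunks of `limit`:

```python
def naive_split(flat_hits, nq, limit):
    """Cut a flat result list into nq chunks of `limit` hits each.

    This is only correct if every query returned exactly `limit` hits.
    """
    return [flat_hits[i * limit:(i + 1) * limit] for i in range(nq)]


# Hypothetical flattened response with limit = 3:
# query 1 matched only 2 rows, query 2 matched 3 rows.
flat = ["q1_a", "q1_b", "q2_a", "q2_b", "q2_c"]

groups = naive_split(flat, nq=2, limit=3)
print(groups)
# [['q1_a', 'q1_b', 'q2_a'], ['q2_b', 'q2_c']]
```

The hit `q2_a` ends up attributed to the first query, and nothing in the flat list lets the client detect the mistake.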

yiwen92 avatar May 27 '24 05:05 yiwen92

BTW, is there any proven use case that requires nq > 1 for hybrid search? @xiaofan-luan

yiwen92 avatar May 27 '24 05:05 yiwen92

/assign

smellthemoon avatar Dec 06 '24 02:12 smellthemoon

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] avatar Feb 23 '25 05:02 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] avatar Mar 30 '25 19:03 stale[bot]

It was resolved through another approach: the returned result is still a flat list, but an additional topk list is also returned, so users must segment the results themselves based on the topk list.
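
For readers who need to do that segmentation on the client side, a minimal sketch follows. The response shape and the key names `data` and `topks` are assumptions made for illustration, so check the actual response of your Milvus version. The `split_by_topks` helper is hypothetical and not part of any SDK.

```python
from itertools import accumulate


def split_by_topks(flat_hits, topks):
    """Regroup a flat hit list into one list per query.

    topks[i] is taken to be the number of hits returned for query i.
    """
    if sum(topks) != len(flat_hits):
        raise ValueError("topks do not add up to the number of hits")
    ends = list(accumulate(topks))   # exclusive end offset of each group
    starts = [0] + ends[:-1]         # start offset of each group
    return [flat_hits[s:e] for s, e in zip(starts, ends)]


# Hypothetical response body; the key names are assumptions, not the documented schema.
response = {
    "data": ["q1_a", "q1_b", "q2_a", "q2_b", "q2_c"],
    "topks": [2, 3],
}

per_query = split_by_topks(response["data"], response["topks"])
print(per_query)
# [['q1_a', 'q1_b'], ['q2_a', 'q2_b', 'q2_c']]
```

Because the group sizes come from the topk list rather than from the requested limit, this also handles queries that return fewer hits than the limit.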

zhuwenxing avatar Mar 31 '25 02:03 zhuwenxing