milvus
milvus copied to clipboard
feat: Support Search By PK
issue: #39157
Overview: Support search by PK by resolving IDs to vectors on Proxy side. Upgrade go-api to adapt to new proto definitions.
Design:
- Upgrade milvus-proto/go-api to latest master.
- Implement handleIfSearchByPK in Proxy: resolve IDs to vectors via internal Query, then rewrite SearchRequest.
- Adapt to 'SearchInput' oneof field in SearchRequest across client and handlers.
- Fix binary vector stride calculation bug in placeholder utils.
Compatibility:
- Old Pymilvus can still work w/o this feature
What is included:
- Dense and Sparse
- Multi vector fields
- Rejection on BM25
What is not include:
- Hybrid Search
- EmbeddingList
- Restful API
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: liliu-z
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [liliu-z]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
[ci-v2-notice] Notice: We are gradually rolling out the new ci-v2 system.
- Legacy CI jobs remain unaffected, you can just ignore ci-v2 if you don't want to run it.
- Additional "ci-v2/*" checkers will run for this PR to ensure the new ci-v2 system is working as expected.
- For tests that exist in both v1 and v2, passing in either system is considered PASS.
To rerun ci-v2 checks, comment with:
- /ci-rerun-code-check // for ci-v2/code-check
- /ci-rerun-build // for ci-v2/build
- /ci-rerun-ut-integration // for ci-v2/ut-integration
- /ci-rerun-ut-go // for ci-v2/ut-go
- /ci-rerun-ut-cpp // for ci-v2/ut-cpp
- /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp
- /ci-rerun-e2e-arm // for ci-v2/e2e-arm [master branch only]
- /ci-rerun-e2e-default // for ci-v2/e2e-default [master branch only]
If you have any questions or requests, please contact @zhikunyao.
Codecov Report
:x: Patch coverage is 55.85106% with 83 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 73.38%. Comparing base (0171511) to head (cb7cbf4).
:warning: Report is 27 commits behind head on master.
:x: Your patch check has failed because the patch coverage (55.85%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. :x: Your project check has failed because the head coverage (73.38%) is below the target coverage (77.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## master #45820 +/- ##
==========================================
- Coverage 76.05% 73.38% -2.68%
==========================================
Files 1890 1365 -525
Lines 295336 212971 -82365
==========================================
- Hits 224618 156284 -68334
+ Misses 63289 49231 -14058
- Partials 7429 7456 +27
| Components | Coverage Δ | |
|---|---|---|
| Client | 78.18% <100.00%> (+0.01%) |
:arrow_up: |
| Core | ∅ <ø> (∅) |
|
| Go | 74.03% <49.39%> (-0.10%) |
:arrow_down: |
| Files with missing lines | Coverage Δ | |
|---|---|---|
| client/milvusclient/read_options.go | 77.82% <100.00%> (+0.20%) |
:arrow_up: |
| ...nternal/distributed/proxy/httpserver/handler_v1.go | 88.55% <100.00%> (+0.02%) |
:arrow_up: |
| ...nternal/distributed/proxy/httpserver/handler_v2.go | 86.46% <100.00%> (+0.02%) |
:arrow_up: |
| tests/integration/util_query.go | 69.44% <100.00%> (+0.49%) |
:arrow_up: |
| internal/proxy/task_search.go | 69.36% <50.00%> (ø) |
|
| internal/distributed/proxy/httpserver/handler.go | 97.44% <50.00%> (-0.45%) |
:arrow_down: |
| pkg/util/funcutil/placeholdergroup.go | 48.08% <0.00%> (ø) |
|
| internal/proxy/impl.go | 71.82% <43.93%> (-0.88%) |
:arrow_down: |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
overall it looks good to me.
might need some more monitoring metrics
also need to unify vector -> bytevector function. (restful need to support byte vector search as well)
overall it looks good to me.
might need some more monitoring metrics
Any suggestion on this. For latency, query & search will be recorded separately. I can help add a unified latency and label out the subquery and subsearch, also, do this for hybrid search in the next PR
also need to unify vector -> bytevector function. (restful need to support byte vector search as well)
The restfulv2 has already support binary vector.
I guess you meant merging vector2Bytes and binaryVector2Bytes. These two are only for restfultV1, no one use but code is still there (will find a time to clean it out).
/ci-rerun-ut-integration
/ci-rerun-e2e-default
/ci-rerun-ut-go
/ci-rerun-ut-go
@liliu-z go-sdk check failed, comment rerun go-sdk can trigger the job again.
rerun go-sdk
@liliu-z cpu-e2e job failed, comment /run-cpu-e2e can trigger the job again.