feat: support json index

Open sunby opened this issue 1 year ago • 43 comments

https://github.com/milvus-io/milvus/issues/35528

This PR adds JSON index support for JSON fields and dynamic fields. For now, the index only serves unary filter expressions such as 'a["b"] > 1'. Support for more filter types will be added later.

Basic usage:

# `collection` is an existing pymilvus Collection; DataType is imported from pymilvus.
collection.create_index("json_field", {
    "index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING,
               "json_path": 'json_field["a"]["b"]'}})

This index has some limitations:

  1. If a record does not contain the JSON path you specify, the record is skipped and no error is raised.
  2. If the value at the JSON path cannot be cast to the type you specify, the record is skipped and no error is raised.
  3. A given JSON path can have only one JSON index.
  4. If you try to create more than one JSON index on the same JSON field, the SDK (pymilvus<=2.4.7) may return immediately because of its internal implementation. This will be fixed in a later version.

sunby avatar Oct 10 '24 10:10 sunby

@sunby Please associate the related issue to the body of your Pull Request. (eg. “issue: #”)

mergify[bot] avatar Oct 10 '24 10:10 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Oct 10 '24 11:10 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Oct 10 '24 11:10 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Oct 12 '24 04:10 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Oct 31 '24 09:10 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Oct 31 '24 10:10 mergify[bot]

Codecov Report

Attention: Patch coverage is 78.52029% with 90 lines in your changes missing coverage. Please review.

Project coverage is 80.08%. Comparing base (53a4207) to head (49435e8). Report is 30 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/datacoord/index_meta.go | 57.14% | 8 Missing and 4 partials :warning: |
| internal/datacoord/index_service.go | 84.81% | 9 Missing and 3 partials :warning: |
| internal/core/src/exec/expression/UnaryExpr.cpp | 81.39% | 8 Missing :warning: |
| ...rnal/core/src/segcore/ChunkedSegmentSealedImpl.cpp | 36.36% | 7 Missing :warning: |
| internal/util/indexparamcheck/inverted_checker.go | 30.00% | 6 Missing and 1 partial :warning: |
| internal/core/src/index/IndexFactory.cpp | 80.00% | 6 Missing :warning: |
| internal/core/src/index/JsonInvertedIndex.cpp | 70.00% | 6 Missing :warning: |
| internal/querynodev2/services.go | 40.00% | 5 Missing and 1 partial :warning: |
| internal/core/src/index/InvertedIndexTantivy.h | 76.47% | 4 Missing :warning: |
| internal/core/src/index/JsonInvertedIndex.h | 81.81% | 4 Missing :warning: |

... and 7 more

:x: Your patch status has failed because the patch coverage (78.52%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #36750      +/-   ##
==========================================
- Coverage   80.10%   80.08%   -0.03%     
==========================================
  Files        1482     1490       +8     
  Lines      205063   206217    +1154     
==========================================
+ Hits       164275   165144     +869     
- Misses      34886    35098     +212     
- Partials     5902     5975      +73     

| Components | Coverage Δ |
| --- | --- |
| Client | 79.24% <ø> (-0.04%) :arrow_down: |
| Core | 69.50% <80.00%> (+0.11%) :arrow_up: |
| Go | 81.79% <76.53%> (-0.05%) :arrow_down: |

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/core/src/bitset/common.h | 94.11% <100.00%> (ø) |
| internal/core/src/common/FieldDataInterface.h | 56.76% <100.00%> (+1.34%) :arrow_up: |
| internal/core/src/common/FieldMeta.h | 94.82% <ø> (ø) |
| ...e/src/exec/expression/BinaryArithOpEvalRangeExpr.h | 100.00% <100.00%> (ø) |
| ...nternal/core/src/exec/expression/BinaryRangeExpr.h | 94.00% <100.00%> (ø) |
| internal/core/src/exec/expression/ExistsExpr.h | 100.00% <100.00%> (ø) |
| internal/core/src/exec/expression/Expr.h | 70.00% <100.00%> (+0.71%) :arrow_up: |
| ...ternal/core/src/exec/expression/JsonContainsExpr.h | 100.00% <100.00%> (ø) |
| internal/core/src/exec/expression/NullExpr.h | 100.00% <100.00%> (ø) |
| internal/core/src/exec/expression/TermExpr.h | 85.71% <100.00%> (ø) |

... and 27 more

... and 111 files with indirect coverage changes

codecov[bot] avatar Oct 31 '24 10:10 codecov[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Oct 31 '24 10:10 mergify[bot]

@sunby

Are you done with this feature so I can start to review it?

xiaofan-luan avatar Nov 01 '24 00:11 xiaofan-luan

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 04 '24 07:11 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Nov 04 '24 07:11 mergify[bot]

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

mergify[bot] avatar Nov 04 '24 07:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 05 '24 09:11 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Nov 05 '24 09:11 mergify[bot]

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

mergify[bot] avatar Nov 05 '24 09:11 mergify[bot]

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

mergify[bot] avatar Nov 06 '24 04:11 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Nov 06 '24 04:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 06 '24 05:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 06 '24 07:11 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Nov 06 '24 07:11 mergify[bot]

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

mergify[bot] avatar Nov 06 '24 07:11 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Nov 06 '24 09:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 06 '24 09:11 mergify[bot]

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Nov 06 '24 11:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 06 '24 13:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 07 '24 05:11 mergify[bot]

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

mergify[bot] avatar Nov 07 '24 06:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 07 '24 07:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 08 '24 08:11 mergify[bot]

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Nov 08 '24 11:11 mergify[bot]