feat: Support stats task to sort segment by PK

Open xiaocai2333 opened this issue 1 year ago • 27 comments

issue: #33744

This PR includes the following changes:

  1. Added a new task type to the task scheduler in datacoord: stats task, which sorts segments by primary key.
  2. Implemented segment sorting in indexnode.
  3. Added a new field FieldStatsLog to SegmentInfo to store token index information.
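To make "sort segment by PK" concrete, here is a minimal sketch in Go. It is not the code from this PR: `Row` and `SortByPK` are hypothetical names, and the sketch assumes rows are held in memory with an int64 primary key, whereas the real task in indexnode works on segment binlogs.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical in-memory record used only for illustration.
type Row struct {
	PK        int64
	Timestamp uint64
}

// SortByPK returns a copy of rows ordered by ascending primary key.
// The sort is stable, so rows with equal keys keep their original order.
func SortByPK(rows []Row) []Row {
	out := make([]Row, len(rows))
	copy(out, rows)
	sort.SliceStable(out, func(i, j int) bool { return out[i].PK < out[j].PK })
	return out
}

func main() {
	rows := []Row{{PK: 42, Timestamp: 1}, {PK: 7, Timestamp: 2}, {PK: 19, Timestamp: 3}}
	for _, r := range SortByPK(rows) {
		fmt.Println(r.PK, r.Timestamp)
	}
}
```

Once rows are ordered this way, a lookup by primary key can use binary search instead of a full scan, which is presumably one motivation for sorting sealed segments.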

xiaocai2333 avatar Jul 29 '24 03:07 xiaocai2333

@xiaocai2333

Invalid PR Title Format Detected

Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:

  1. Title Format: The PR title must begin with one of these prefixes:
     • feat: for introducing a new feature.
     • fix: for bug fixes.
     • enhance: for improvements to existing functionality.
     • test: for adding tests to existing functionality.
     • doc: for modifying documentation.
     • auto: for pull requests from a bot.
  2. Description Requirement: The PR must include a non-empty description, detailing the changes and their impact.

Required Title Structure:

[Type]: [Description of the PR]

Where Type is one of feat, fix, enhance, test or doc.

Example:

enhance: improve search performance significantly 

Please review and update your PR to comply with these guidelines.
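The title rule above can be approximated with a regular expression. This is a sketch under assumptions, not the bot's actual check: `ValidTitle` is a hypothetical helper, and it assumes exactly one space after the colon and a description that starts with a non-space character.

```go
package main

import (
	"fmt"
	"regexp"
)

// titlePattern encodes the quoted rule: one of the listed prefixes,
// a colon, a single space, then a non-empty description.
// The exact spacing requirement is an assumption.
var titlePattern = regexp.MustCompile(`^(feat|fix|enhance|test|doc|auto): \S.*$`)

// ValidTitle reports whether a PR title matches the assumed format.
func ValidTitle(title string) bool {
	return titlePattern.MatchString(title)
}

func main() {
	fmt.Println(ValidTitle("enhance: improve search performance significantly"))
	fmt.Println(ValidTitle("improve search performance"))
}
```

The first call uses the bot's own example title and matches; the second has no prefix and does not.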

mergify[bot] avatar Jul 29 '24 03:07 mergify[bot]

@xiaocai2333 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Jul 29 '24 03:07 mergify[bot]

rerun ut

xiaocai2333 avatar Aug 12 '24 09:08 xiaocai2333

Codecov Report

Attention: Patch coverage is 82.77842% with 300 lines in your changes missing coverage. Please review.

Project coverage is 81.65%. Comparing base (311f860) to head (974365b). Report is 8 commits behind head on master.

Files with missing lines                                 Patch %   Lines
internal/indexnode/task_stats.go                         77.18%    59 Missing and 22 partials :warning:
internal/datacoord/garbage_collector.go                  30.61%    30 Missing and 4 partials :warning:
internal/core/src/segcore/SegmentSealedImpl.cpp          77.20%    31 Missing :warning:
internal/datacoord/task_stats.go                         88.62%    25 Missing and 4 partials :warning:
internal/indexnode/indexnode_service.go                  79.66%    21 Missing and 3 partials :warning:
internal/datanode/compaction/merge_sort.go               72.83%    14 Missing and 8 partials :warning:
internal/datacoord/stats_task_meta.go                    91.52%    13 Missing and 2 partials :warning:
...datanode/compaction/segment_reader_from_binlogs.go    67.50%    10 Missing and 3 partials :warning:
internal/datacoord/meta.go                               83.01%    6 Missing and 3 partials :warning:
internal/datacoord/task_scheduler.go                     90.58%    8 Missing :warning:
... and 11 more

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #35054      +/-   ##
==========================================
+ Coverage   79.84%   81.65%   +1.80%     
==========================================
  Files        1239     1256      +17     
  Lines      148031   150540    +2509     
==========================================
+ Hits       118194   122922    +4728     
+ Misses      25005    22730    -2275     
- Partials     4832     4888      +56     

Files with missing lines                         Coverage Δ
internal/core/src/segcore/SegmentSealedImpl.h    52.38% <100.00%> (-4.15%) :arrow_down:
internal/core/src/segcore/segment_c.cpp          71.06% <100.00%> (+0.32%) :arrow_up:
internal/datacoord/analyze_meta.go               100.00% <100.00%> (ø)
internal/datacoord/compaction.go                 74.10% <100.00%> (+9.83%) :arrow_up:
internal/datacoord/compaction_task.go            90.00% <100.00%> (+1.11%) :arrow_up:
internal/datacoord/compaction_task_l0.go         100.00% <100.00%> (+4.70%) :arrow_up:
internal/datacoord/compaction_task_mix.go        64.40% <100.00%> (+7.38%) :arrow_up:
internal/datacoord/compaction_trigger.go         84.96% <100.00%> (+2.57%) :arrow_up:
internal/datacoord/index_meta.go                 95.64% <100.00%> (+0.34%) :arrow_up:
internal/datacoord/server.go                     74.37% <100.00%> (+5.22%) :arrow_up:
... and 43 more

... and 213 files with indirect coverage changes

codecov[bot] avatar Aug 13 '24 04:08 codecov[bot]

@xiaocai2333 Thanks for your contribution. Please submit with DCO, see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.

mergify[bot] avatar Aug 27 '24 12:08 mergify[bot]

/run-cpu-e2e

xiaocai2333 avatar Aug 27 '24 12:08 xiaocai2333

/run-cpu-e2e

xiaocai2333 avatar Aug 28 '24 09:08 xiaocai2333
