enhance: add char group tokenizer

Open aoiasd opened this issue 5 months ago • 2 comments

Related: https://github.com/milvus-io/milvus/issues/42792

Add a char group tokenizer that supports using a custom char group, or one of the built-in char groups, as delimiters.
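
To illustrate the idea, here is a minimal Python sketch of splitting text on a set of delimiter characters. It is not the code from this PR and not the Milvus API: the function name `char_group_tokenize`, the `BUILTIN_GROUPS` table, and the group names `whitespace` and `punctuation` are all assumptions made for the example.

```python
import string

# Hypothetical built-in group names; the PR description does not list the real ones.
BUILTIN_GROUPS = {
    "whitespace": set(string.whitespace),
    "punctuation": set(string.punctuation),
}


def char_group_tokenize(text, groups):
    """Split text on every character that belongs to any of the given groups."""
    delimiters = set()
    for group in groups:
        # A known name selects a built-in group; any other string is treated
        # as a custom group whose characters are all delimiters.
        delimiters |= BUILTIN_GROUPS.get(group, set(group))

    tokens, current = [], []
    for ch in text:
        if ch in delimiters:
            # A delimiter closes the current token; empty tokens are dropped.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# Custom group: split only on '-' and '_', so the space stays inside a token.
print(char_group_tokenize("a-b_c d", ["-_"]))  # ['a', 'b', 'c d']

# Built-in group combined with a custom one.
print(char_group_tokenize("hello-world foo", ["whitespace", "-"]))  # ['hello', 'world', 'foo']
```

In this sketch, a custom group that happens to spell a built-in name would be read as the built-in group; a real implementation would likely distinguish the two explicitly in its configuration.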

aoiasd avatar Jun 16 '25 12:06 aoiasd

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 78.89%. Comparing base (7b8bf63) to head (39a05b9).
:warning: Report is 11 commits behind head on master.

@@            Coverage Diff             @@
##           master   #42793      +/-   ##
==========================================
+ Coverage   78.85%   78.89%   +0.03%     
==========================================
  Files        1568     1568              
  Lines      225854   225854              
==========================================
+ Hits       178102   178179      +77     
+ Misses      41263    41196      -67     
+ Partials     6489     6479      -10     

Components   Coverage Δ
Client       79.47% <ø> (ø)
Core         73.90% <ø> (+0.01%) :arrow_up:
Go           79.83% <ø> (+0.03%) :arrow_up:

See 33 files with indirect coverage changes.

codecov[bot] avatar Jun 16 '25 14:06 codecov[bot]

Could you give a brief introduction to char group tokenizers, with some simple examples?

SpadeA-Tang avatar Jun 25 '25 12:06 SpadeA-Tang

@aoiasd cpu-e2e job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Jul 10 '25 12:07 mergify[bot]

@aoiasd go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Jul 14 '25 05:07 mergify[bot]

@aoiasd cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

mergify[bot] avatar Jul 14 '25 08:07 mergify[bot]

rerun go-sdk

aoiasd avatar Jul 15 '25 03:07 aoiasd

@aoiasd cpu-e2e job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Jul 15 '25 03:07 mergify[bot]

@aoiasd go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Jul 15 '25 03:07 mergify[bot]

@aoiasd cpu-e2e job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Jul 15 '25 13:07 mergify[bot]

/approve

zhengbuqian avatar Jul 20 '25 15:07 zhengbuqian

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aoiasd, SpadeA-Tang, zhengbuqian

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

sre-ci-robot avatar Jul 20 '25 15:07 sre-ci-robot

@aoiasd cpu-e2e job failed, comment /run-cpu-e2e can trigger the job again.

mergify[bot] avatar Jul 24 '25 04:07 mergify[bot]

@aoiasd go-sdk check failed, comment rerun go-sdk can trigger the job again.

mergify[bot] avatar Jul 24 '25 06:07 mergify[bot]

/lgtm

zhengbuqian avatar Jul 29 '25 03:07 zhengbuqian