starrocks
[Enhancement] Support tokenize function
Why I'm doing:
It is hard for users to tell how the output of the various tokenizers differs, so we need a tokenize function that lets users inspect the tokens a given tokenizer produces.
What I'm doing:
Support a tokenize function of the form tokenize(<tokenizer_name>, <content>).
Fixes #45145
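A minimal sketch of the intended call shape, for illustration only. It is written in Python rather than taken from the BE code in this PR, and the tokenizer names `whitespace` and `lowercase_words` are hypothetical placeholders, not the tokenizers StarRocks actually registers. It only shows the idea: pick a tokenizer by name, run it on the content, and return the tokens so the differences between tokenizers are visible.

```python
import re


def _whitespace(content):
    # Split on runs of whitespace; punctuation stays attached to the tokens.
    return content.split()


def _lowercase_words(content):
    # Lowercase the input, then keep only runs of ASCII letters and digits.
    return re.findall(r"[a-z0-9]+", content.lower())


# Hypothetical registry: these names are assumptions made for this sketch.
TOKENIZERS = {
    "whitespace": _whitespace,
    "lowercase_words": _lowercase_words,
}


def tokenize(tokenizer_name, content):
    """Return the list of tokens that the named tokenizer produces for content."""
    try:
        tokenizer = TOKENIZERS[tokenizer_name]
    except KeyError:
        raise ValueError(f"unknown tokenizer: {tokenizer_name!r}") from None
    return tokenizer(content)
```

Calling `tokenize("whitespace", "Hello, World")` and `tokenize("lowercase_words", "Hello, World")` on the same input returns different token lists, which is the kind of side-by-side comparison the function is meant to make easy.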
What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [x] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [x] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [ ] 3.3
- [ ] 3.2
- [ ] 3.1
- [ ] 3.0
- [ ] 2.5
@dujijun007 thank you for the contribution, could you create an issue to describe this new function? About its interface, input, output and limits?
@imay OK, I have linked it here (#45145).
Quality Gate passed
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
No data about Coverage
No data about Duplication
[FE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[BE Incremental Coverage Report]
:white_check_mark: pass : 55 / 57 (96.49%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | be/src/exprs/gin_functions.cpp | 55 | 57 | 96.49% | [71, 91] |