[Feature] Support regexp_split function
Why I'm doing:
What I'm doing:
Support the split_by_regexp function, compatible with ClickHouse's splitByRegexp and Spark's split function.
Fix https://github.com/StarRocks/starrocks/issues/37089
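For reviewers unfamiliar with the semantics, here is a minimal sketch of what splitting by a regular expression means. It is not the code in this PR: the helper name `split_by_regex_sketch` is hypothetical, `std::regex` is used only to keep the example self-contained, and the empty-pattern behavior (one element per byte) is an assumption modeled on what ClickHouse documents for splitByRegexp.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not a StarRocks API: splits `input` on every
// non-empty match of `pattern` and returns the pieces in order.
std::vector<std::string> split_by_regex_sketch(const std::string& input,
                                               const std::string& pattern) {
    std::vector<std::string> parts;

    // Assumption: an empty pattern splits the input into single bytes
    // (this sketch is not UTF-8 aware).
    if (pattern.empty()) {
        for (char c : input) {
            parts.push_back(std::string(1, c));
        }
        return parts;
    }

    const std::regex re(pattern);
    std::size_t last = 0;  // start of the piece currently being collected
    for (std::sregex_iterator it(input.begin(), input.end(), re), end; it != end; ++it) {
        const std::smatch& m = *it;
        if (m.length() == 0) {
            continue;  // skip zero-width matches; they do not act as delimiters here
        }
        const auto pos = static_cast<std::size_t>(m.position());
        parts.push_back(input.substr(last, pos - last));  // text before the delimiter
        last = pos + static_cast<std::size_t>(m.length());
    }
    parts.push_back(input.substr(last));  // tail after the final delimiter
    return parts;
}
```

Under these assumptions, `split_by_regex_sketch("a1b22c", "[0-9]+")` yields the three pieces `a`, `b`, and `c`, and adjacent delimiters produce an empty piece between them.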
Which issues does this PR fix:
Partially completes the regexp_split function requested in: https://github.com/StarRocks/starrocks/issues/37089
After this PR is merged, a follow-up PR will be submitted to convert regexp_split to split_by_regexp in Trino mode.
What type of PR is this:
- [ ] BugFix
- [x] Feature
- [ ] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [x] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [x] I have added test cases for my bug fix or my new feature
- [x] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
- [x] 3.3
- [x] 3.2
- [x] 3.1
- [x] 3.0
- [ ] 2.5
Quality Gate passed
- Issues: 0 new issues, 0 accepted issues
- Measures: 0 security hotspots, 0.0% coverage on new code, 0.0% duplication on new code
[FE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[BE Incremental Coverage Report]
:white_check_mark: pass : 159 / 179 (88.83%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | be/src/exprs/regexp_split.cpp | 38 | 50 | 76.00% | [41, 42, 70, 74, 76, 77, 78, 79, 80, 84, 85, 86] |
| :large_blue_circle: | be/src/exprs/string_functions.cpp | 120 | 128 | 93.75% | [3782, 3793, 3808, 3842, 3884, 3885, 3886, 3887] |
| :large_blue_circle: | be/src/exprs/regexp_split.h | 1 | 1 | 100.00% | [] |
- Could you please add documentation? Thanks!
- As for the function's name, StarRocks already has regexp, regexp_extract, regexp_extract_all, and regexp_replace, so I think this function would be better named regexp_split.
- Also, does this function do the same job as Trino's regexp_split? Is any further transformation needed to be compatible with Trino in the Trino dialect?
@wangsimo0 done. The function should be compatible with Trino's regexp_split and no further transformation is required.