Improve performance of `regexp_count`

Open alamb opened this issue 1 year ago • 0 comments

Is your feature request related to a problem or challenge?

@Omega359 and @xinlifoobar added the regexp_count function in https://github.com/apache/datafusion/pull/12970

However, `regexp_count` seems to be significantly slower than `regexp_match` and `regexp_like` (see https://github.com/apache/datafusion/pull/12080#issuecomment-2316755718 for a benchmark).

Describe the solution you'd like

I would like to review the benchmark for `regexp_count`, confirm whether it really is slower and why, and if so try to improve it.

Describe alternatives you've considered

No response

Additional context

No response

alamb, Oct 18 '24 20:10