datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Implementation for regex_instr

Open nirnayroy opened this issue 7 months ago • 9 comments

Which issue does this PR close?

  • Closes #13009

Rationale for this change

Implements a regex SQL standard function in datafusion

What changes are included in this PR?

Implementation, tests, benches and docs for the regexp_instr function

Are these changes tested?

Yes

Are there any user-facing changes?

Yes

No

nirnayroy avatar May 02 '25 20:05 nirnayroy

Thank you. I'm wondering what's the reference system for this function's behavior (like postgres or others)

2010YOUY01 avatar May 04 '25 03:05 2010YOUY01

Thank you. I'm wondering what's the reference system for this function's behavior (like postgres or others)

The reference system for this function's behaviour is postgres.

nirnayroy avatar May 04 '25 06:05 nirnayroy

Thank you for this PR @nirnayroy

Can you please resolve the CI error: https://github.com/apache/datafusion/actions/runs/14820525339/job/41754009017?pr=15928

If you encounter an error, run './dev/update_function_docs.sh' and commit

alamb avatar May 07 '25 20:05 alamb

Thank you for this PR @nirnayroy

Can you please resolve the CI error: https://github.com/apache/datafusion/actions/runs/14820525339/job/41754009017?pr=15928

If you encounter an error, run './dev/update_function_docs.sh' and commit

I ran the bash script, but I’m not sure if the workflow succeeded.

nirnayroy avatar May 19 '25 12:05 nirnayroy

fixed the cippy errors showing up in the workflow

nirnayroy avatar May 19 '25 21:05 nirnayroy

fixed formatting error in workflow

nirnayroy avatar May 20 '25 05:05 nirnayroy

@Omega359 I wonder if you might have time to review this PR?

alamb avatar May 21 '25 18:05 alamb

Of course @alamb, not sure how I missed this one. It may be a day or two though

Omega359 avatar May 21 '25 19:05 Omega359

Hi @blaginin, thanks for the review and regret the delay in reply. I think I have rectified a majority of the concerns raised. Please have a look again.

nirnayroy avatar Jun 12 '25 07:06 nirnayroy

Hi @blaginin , thanks for the help and suggestions for improvement. I have addressed the requested changes. Please have another look.

Tests are failing. If that helps, you can run the CI command locally to debug:

I have tried running it and the tests are passing on my local.

nirnayroy avatar Jun 28 '25 16:06 nirnayroy

I'll pull this branch later this week and run the tests but in general this PR is looking pretty good! I left a few comments/suggestions for a few things I found from a quick review.

Omega359 avatar Jul 01 '25 20:07 Omega359

Run extended tests

Omega359 avatar Jul 02 '25 19:07 Omega359

Run extended tests

It works!

alamb avatar Jul 02 '25 19:07 alamb

Clippy failures related to rand update (I think https://github.com/apache/datafusion/pull/16062)

Edit: looks like the usages of rand for the benchmark was updated in the above commit ... I'm thinking the additions in this PR do not reflect that change.

Omega359 avatar Jul 02 '25 20:07 Omega359

Run extended tests

Omega359 avatar Jul 02 '25 20:07 Omega359

Run extended tests

Omega359 avatar Jul 03 '25 14:07 Omega359

LGTM :) @blaginin

Omega359 avatar Jul 03 '25 17:07 Omega359

@nirnayroy thank you so much for your work, i think that's a very useful new function šŸŽ‰ welcome to the project!!!

blaginin avatar Jul 03 '25 23:07 blaginin

šŸš€

alamb avatar Jul 05 '25 14:07 alamb