Follow on from recent regexp fixes to reject patterns that cuDF no longer rejects
Closes https://github.com/NVIDIA/spark-rapids/issues/6518
This follows up on https://github.com/NVIDIA/spark-rapids/pull/6548 and confirms that we should keep rejecting certain regexp patterns even though cuDF no longer rejects them. We reject them because cuDF either hangs on them or produces results that are inconsistent with Java.
For the patterns that we reject with the message "End of line/string anchor is not supported in this context", I did not spend time determining whether we could transpile them differently to support them. This looks like an edge case that we are unlikely to encounter, so I suggest we only spend time on it if someone asks us to support it.
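This description does not list the specific rejected patterns, so the following is only a general illustration of why end-of-line anchors are hard to reproduce exactly. It is a minimal sketch of how Java itself treats `$` in its default (non-MULTILINE) mode; the class and method names are hypothetical and are not part of spark-rapids or cuDF.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of spark-rapids.
class DollarAnchorDemo {
    // In Java's default mode, `$` matches at the very end of the input and
    // also just before a single trailing line terminator (\n or \r\n).
    static boolean endsWithA(String input) {
        return Pattern.compile("a$").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithA("a"));     // true
        System.out.println(endsWithA("a\n"));   // true: `$` matches before the final \n
        System.out.println(endsWithA("a\r\n")); // true: `$` matches before the final \r\n
        System.out.println(endsWithA("a\nb"));  // false: the \n is not at the end of input
    }
}
```

Any GPU-side pattern has to agree with every one of these cases to be consistent with Java, which is the kind of requirement that makes rejecting unusual anchor placements a conservative choice.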