MithunR

Results 156 comments of MithunR

I've moved this over to 24.08, rather than rush it into 24.06. A parallel effort is under way to explore KMP or Aho-Corasick, to see whether either fits better.

I've been unable to spend time on this, so I'm closing it for now. I'll try to hit this again later, with the modified KMP approach.

I have raised https://github.com/NVIDIA/spark-rapids-jni/issues/2029. To me, it looks like a bug in how percentiles are derived from the constructed histograms.

Odd: The test runs locally, but fails in CI. Investigating...

Edit: Here's the complaint in the failure:

```
2024-07-01T22:49:45.4593611Z [2024-07-01T22:48:19.060Z] 2024-07-01 22:33:46 INFO Running test 'src/main/python/subquery_test.py::test_scalar_subquery_array[Null-True][DATAGEN_SEED=1719870311, TZ=UTC, IGNORE_ORDER({'local': True})]'
```
...

I'm investigating the test failures. It seems that, on Databricks, the generated input isn't causing an exception in ANSI mode.