
WIP: Spark 4: Fix miscellaneous tests including logic, repart, hive_delimited.

Open mythrocks opened this issue 1 year ago • 1 comments

(Partially) fixes #11031.

This PR addresses tests that fail on Spark 4.0 in the following files:

  1. integration_tests/src/main/python/datasourcev2_read_test.py
  2. integration_tests/src/main/python/expand_exec_test.py
  3. integration_tests/src/main/python/get_json_test.py
  4. integration_tests/src/main/python/hive_delimited_text_test.py
  5. integration_tests/src/main/python/logic_test.py
  6. integration_tests/src/main/python/repart_test.py
  7. integration_tests/src/main/python/time_window_test.py

mythrocks avatar Jul 02 '24 23:07 mythrocks

Still a work in progress. A couple of other tests to be addressed.

mythrocks avatar Jul 02 '24 23:07 mythrocks

Build

mythrocks avatar Jul 08 '24 21:07 mythrocks

Build

mythrocks avatar Jul 08 '24 22:07 mythrocks

That last failure was an interesting one to track down.

Time interval calculations on Spark < 3.3 involve multiplication/division aggregation operations. These tend to fall off the GPU in ANSI mode because of #5114, so this test is guaranteed to fail there: part of the plan runs off the GPU.

For Spark >= 3.3, the same calculations seem to involve modulo operations instead, which don't appear susceptible to ANSI-mode failures.

I've added a skip for this test when ANSI is enabled on Spark < 3.3. The skip can be rolled back once #5114 is addressed.
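
For anyone reading along, here is a minimal sketch of the skip condition described above. It is not the code from this PR: `parse_version`, `should_skip_time_window_test`, `SPARK_VERSION`, and `ANSI_ENABLED` are hypothetical names for illustration, and it uses the standard library's `unittest.skipIf` only to stay self-contained. In the real tests the version and ANSI setting would come from the running Spark session.

```python
import unittest


def parse_version(version: str) -> tuple:
    """Turn '3.2.4' or '4.0.0-SNAPSHOT' into a comparable tuple such as (3, 2, 4)."""
    core = version.split("-")[0]  # drop suffixes like '-SNAPSHOT'
    return tuple(int(part) for part in core.split(".")[:3])


def should_skip_time_window_test(spark_version: str, ansi_enabled: bool) -> bool:
    """Skip only when ANSI mode is on AND Spark is older than 3.3.

    Assumption (from the comment above): on Spark < 3.3 the interval math
    falls off the GPU in ANSI mode, so the test cannot pass there.
    """
    return ansi_enabled and parse_version(spark_version) < (3, 3, 0)


# Hypothetical values; a real test would read these from the Spark session.
SPARK_VERSION = "3.2.4"
ANSI_ENABLED = True


class TimeWindowTest(unittest.TestCase):
    @unittest.skipIf(
        should_skip_time_window_test(SPARK_VERSION, ANSI_ENABLED),
        "Part of the plan falls off the GPU in ANSI mode on Spark < 3.3 (see #5114)",
    )
    def test_time_window(self):
        pass  # placeholder for the actual GPU/CPU comparison
```

Keeping the condition in one small predicate means that rolling the skip back later is a one-line change.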

mythrocks avatar Jul 09 '24 22:07 mythrocks

Build

mythrocks avatar Jul 09 '24 22:07 mythrocks

Build

mythrocks avatar Jul 11 '24 18:07 mythrocks

@NVnavkumar, I was wondering if you might take another look at this one.

mythrocks avatar Jul 15 '24 17:07 mythrocks

Build

mythrocks avatar Jul 16 '24 17:07 mythrocks

There seems to be an error on Spark 3.3, where the expected exception isn't thrown. It's taking a bit of time to repro. I'll update here once I have something.

mythrocks avatar Jul 16 '24 21:07 mythrocks

I think I've addressed the Databricks failure. I'll kick off another build and ask the reviewers for another round.

mythrocks avatar Jul 17 '24 23:07 mythrocks

Build

mythrocks avatar Jul 17 '24 23:07 mythrocks

@NVnavkumar, I've fixed the last nit. Does this look agreeable?

mythrocks avatar Jul 18 '24 15:07 mythrocks

Thank you for reviewing, @NVnavkumar. This change has now been merged.

mythrocks avatar Jul 18 '24 22:07 mythrocks