RemoteShuffleService
Spark 3.1/3.2: SQL skew join and local shuffle reader tests fail
Hi, I ran SparkSqlOptimizeSkewedJoinTest and SparkSqlOptimizeLocalShuffleReaderTest with Spark 3.1 and Spark 3.2, and both RSS tests failed with an assertion error because the output contained duplicate rows.
For example, the expected output of SparkSqlOptimizeLocalShuffleReaderTest has 2 records:
1 100, 1 101
However, the RSS output has 8 records, with each expected row repeated 4 times:
1 100, 1 100, 1 100, 1 100, 1 101, 1 101, 1 101, 1 101
I also ran the tests with Spark 3.0, and they passed. Do you have any idea why this issue occurs with Spark 3.1 and 3.2?
Previously, RSS was not tested much with Spark 3.1/3.2 and Adaptive Query Execution (AQE). It looks like the code has a bug. I would love to see someone debug this further.
@hiboyang Hi, I found the bug and fixed it in a pull request.