[Audit][SPARK-40660][CORE][SQL] Switch to XORShiftRandom to distribute elements
The change in https://github.com/apache/spark/commit/e6bebb6665 switched to XORShiftRandom in place of Random(hashing.byteswap32(index)) in a couple of places. I don't believe the RDD.coalesce change affects us, but the change to getPartitionKeyExtractor should probably be reflected here: https://github.com/NVIDIA/spark-rapids/blob/801a339bb3088131110076d8f3385ad9a57761f5/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRoundRobinPartitioning.scala#L94
It also seems that we don't have the byteswap32 call that was previously there in open source Spark.
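
To make the difference concrete, here is a minimal sketch of how the round-robin start position depends on the seeding strategy. It is not the plugin's or Spark's actual code: `StartPositionSketch`, `plainStart`, and `byteswapStart` are hypothetical names, and only `scala.util.Random` and `scala.util.hashing.byteswap32` from the standard library are used. Spark's XORShiftRandom is not reproduced here, so the sketch only compares seeding with the raw partition id against seeding with the byteswap32-scrambled id.

```scala
import scala.util.Random
import scala.util.hashing.byteswap32

// Hypothetical helper object, for illustration only.
object StartPositionSketch {
  // Seed the generator directly with the partition id (no scrambling).
  def plainStart(partitionId: Int, numPartitions: Int): Int =
    new Random(partitionId).nextInt(numPartitions)

  // Scramble the partition id with byteswap32 before seeding.
  def byteswapStart(partitionId: Int, numPartitions: Int): Int =
    new Random(byteswap32(partitionId)).nextInt(numPartitions)

  def main(args: Array[String]): Unit = {
    val numPartitions = 8
    val ids = 0 until 16
    // Print the start position chosen for each input partition id so the
    // spread of the two seeding strategies can be compared side by side.
    println("plain:    " + ids.map(plainStart(_, numPartitions)).mkString(" "))
    println("byteswap: " + ids.map(byteswapStart(_, numPartitions)).mkString(" "))
  }
}
```

If the start positions are supposed to match CPU Spark, the GPU partitioner would presumably need to use the same generator and seed that Spark uses after the commit above, rather than either variant shown here.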