
[Audit][SPARK-40660][CORE][SQL] Switch to XORShiftRandom to distribute elements

Open abellina opened this issue 3 years ago • 0 comments

The commit https://github.com/apache/spark/commit/e6bebb6665 switched from Random(hashing.byteswap32(index)) to XORShiftRandom in a couple of places. I don't believe the RDD.coalesce change affects us, but the change to getPartitionKeyExtractor may need to be reflected here: https://github.com/NVIDIA/spark-rapids/blob/801a339bb3088131110076d8f3385ad9a57761f5/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuRoundRobinPartitioning.scala#L94

It also seems that our code doesn't have the byteswap32 call that open source Spark had before this change.
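To make the difference concrete, here is a minimal sketch of how the starting partition is picked with and without byteswap32. The object and method names are hypothetical, and the "without" variant is only an assumption about the shape of seeding the generator with the raw index; it is not copied from GpuRoundRobinPartitioning. The XORShiftRandom variant is left out because that class is internal to Spark (private[spark]).

```scala
import java.util.Random
import scala.util.hashing.byteswap32

// Hypothetical helper, for illustration only.
object StartPositionSketch {
  // Form quoted in this issue for Spark before the change:
  // scramble the partition index, then seed java.util.Random with it.
  def startWithByteswap(partitionIndex: Int, numPartitions: Int): Int =
    new Random(byteswap32(partitionIndex)).nextInt(numPartitions)

  // Assumed form without the byteswap32 call: the raw index is the seed.
  def startWithoutByteswap(partitionIndex: Int, numPartitions: Int): Int =
    new Random(partitionIndex).nextInt(numPartitions)

  def main(args: Array[String]): Unit = {
    val numPartitions = 8
    // Print both start positions side by side for a few input partitions.
    (0 until 16).foreach { idx =>
      val a = startWithByteswap(idx, numPartitions)
      val b = startWithoutByteswap(idx, numPartitions)
      println(s"index=$idx withByteswap=$a withoutByteswap=$b")
    }
  }
}
```

Running main prints the start position each variant picks for the first 16 partition indices, which gives a quick way to compare how evenly each one spreads consecutive indices.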

abellina · Oct 17 '22 19:10