spark-rapids
[BUG] Spill occurs in GpuAggregate when the GPU batch size is reduced
Describe the bug
When running a heavy GpuAggregate consisting of over 400 aggregate functions (including hundreds of uses of the complex function stddev_pop), a significant amount of spill is observed in the map stage when using a relatively small GPU batch size (spark.rapids.sql.batchSizeBytes=512MB).
However, spill does not occur with a larger GPU batch size (spark.rapids.sql.batchSizeBytes=2048MB). Accordingly, the execution time is much shorter: