[BUG] Spill occurs in GpuAggregate when GPU batch size reduces

Open sperlingxx opened this issue 1 month ago • 4 comments

Describe the bug

When running a heavy GpuAggregate with over 400 aggregate functions (including hundreds of instances of the complex function stddev_pop), a significant amount of spill is observed in the map stage when using a relatively small GPU batch size (spark.rapids.sql.batchSizeBytes=512MB).

[Image]

However, spill does not occur with a larger GPU batch size (spark.rapids.sql.batchSizeBytes=2048MB). Accordingly, the execution time is much shorter:

[Image]
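For anyone trying to reproduce the comparison, here is a minimal sketch of the two configurations being contrasted. It only builds the `spark-submit` flags and does not run Spark. The helper names (`rapids_conf_args`, `comparison_commands`) are hypothetical, the batch size values are copied verbatim from the report, and the application jar and the query itself are not part of this issue, so they are left out.

```python
import shlex
from typing import Dict, List, Sequence

# Config key quoted in the report above.
BATCH_SIZE_KEY = "spark.rapids.sql.batchSizeBytes"


def rapids_conf_args(batch_size: str) -> List[str]:
    """Return spark-submit --conf arguments for one GPU batch size setting."""
    confs = {
        # Enables the RAPIDS Accelerator plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Value passed through unchanged, exactly as written in the report.
        BATCH_SIZE_KEY: batch_size,
    }
    args: List[str] = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args


def comparison_commands(sizes: Sequence[str] = ("512MB", "2048MB")) -> Dict[str, str]:
    """Build one command prefix per batch size (hypothetical helper)."""
    return {
        size: shlex.join(["spark-submit", *rapids_conf_args(size)])
        for size in sizes
    }


if __name__ == "__main__":
    for size, cmd in comparison_commands().items():
        print(f"{size}: {cmd}")
```

Appending the same application and arguments to both command prefixes keeps the batch size as the only variable between the two runs.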

sperlingxx, Nov 26 '25 02:11