
[QST] Does `spark-rapids` support GPU acceleration for pandas-on-Spark (`pyspark.pandas`)?

Open • asddfl opened this issue 1 month ago • 1 comment

What is your question? Does spark-rapids support GPU acceleration for pandas-on-Spark (pyspark.pandas)?

asddfl, Dec 08 '25 14:12

@asddfl We should mostly support it, but we don't officially test it. pyspark.pandas is generally translated into DataFrame operations that are shared with the SQL back end. If the pandas translation layer uses an operation that we support, then we will try to accelerate it. That said, there are a few cases where pyspark.pandas will either use custom pandas-specific operations (asOfJoin in older versions of Spark) or move the data to Python for processing. We do not really support accelerating anything on the Python side, mostly due to memory management issues with the GPU.

revans2, Dec 08 '25 16:12
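
For anyone who wants to check this on their own workload, here is a minimal sketch (not an officially tested recipe) that enables the plugin and prints the Spark plan behind a pandas-on-Spark operation. The jar path, app name, and helper function names are placeholders invented for this example; the config keys are the plugin's documented settings. It assumes pyspark, a GPU, and the RAPIDS Accelerator jar are available. Operators that ran on the GPU typically show up in the plan with a `Gpu` prefix, and `spark.rapids.sql.explain=NOT_ON_GPU` logs why an operator stayed on the CPU.

```python
# Sketch only: helper names and the jar path below are hypothetical.

def rapids_conf(jar_path):
    """Return Spark settings that turn on the RAPIDS Accelerator plugin."""
    return {
        "spark.jars": jar_path,  # placeholder path to the plugin jar
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # Log each operator that could not be placed on the GPU, with a reason.
        "spark.rapids.sql.explain": "NOT_ON_GPU",
    }


def build_session(conf):
    """Create a SparkSession with the given settings (requires pyspark)."""
    from pyspark.sql import SparkSession  # lazy import so the module loads without pyspark

    builder = SparkSession.builder.appName("pandas-on-spark-gpu-check")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def show_plan():
    """Run a small pandas-on-Spark aggregation and print its Spark plan."""
    import pyspark.pandas as ps

    psdf = ps.DataFrame({"k": [1, 1, 2], "v": [10.0, 20.0, 30.0]})
    result = psdf.groupby("k").sum()
    # The .spark accessor exposes the plan of the underlying Spark DataFrame.
    result.spark.explain()


if __name__ == "__main__":
    build_session(rapids_conf("/path/to/rapids-4-spark.jar"))
    show_plan()
```

Anything that pandas-on-Spark pushes into Python workers (for example, user functions passed to `apply`) will appear in the plan as a Python operator rather than a plain DataFrame operator, which is the case described above as not accelerated.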