[QST] Does `spark-rapids` support GPU acceleration for pandas-on-Spark (`pyspark.pandas`)?
What is your question?
Does spark-rapids support GPU acceleration for pandas-on-Spark (pyspark.pandas)?
@asddfl We should support it for the most part, but we don't officially test it. pyspark.pandas is generally translated into DataFrame operations that are common with the SQL back end. If an operation that we support is used by the pandas translation layer, then we will try to accelerate it. That said, there are a few cases where pyspark.pandas will either use a custom pandas-specific operation (asOfJoin in older versions of Spark) or move the data to Python for processing. We do not really support accelerating anything on the Python side, mostly due to memory management issues with the GPU.
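
One rough way to see which parts of a pyspark.pandas workload end up on the GPU is to look at the physical plan: operators replaced by the plugin show up with a `Gpu` prefix (for example `GpuProject`). The sketch below is only an illustration, not part of the plugin. The helper name `split_plan_operators` and the `SAMPLE_PLAN` text are made up for this example, and the sample plan is hand-written rather than real output.

```python
import re

# Skip the tree-drawing prefix Spark puts before each operator
# (spaces, ':', '+', '-', and codegen markers like '*(1)'), then grab the name.
_OPERATOR = re.compile(r"^[\s:+\-*()\d]*([A-Za-z]\w*)")

# Hand-written, hypothetical plan text used only to exercise the helper.
SAMPLE_PLAN = """\
== Physical Plan ==
GpuColumnarToRow false
+- GpuProject [(x#0L + 1) AS y#2L]
   +- GpuRowToColumnar targetsize(2147483647)
      +- *(1) Scan ExistingRDD[x#0L]
"""


def split_plan_operators(plan_text):
    """Return (gpu_ops, cpu_ops): operator names split by the 'Gpu' prefix."""
    gpu_ops, cpu_ops = [], []
    for line in plan_text.splitlines():
        match = _OPERATOR.match(line)
        if match is None:
            continue  # header lines such as '== Physical Plan ==' or blanks
        name = match.group(1)
        if name.startswith("Gpu"):
            gpu_ops.append(name)
        else:
            cpu_ops.append(name)
    return gpu_ops, cpu_ops


gpu_ops, cpu_ops = split_plan_operators(SAMPLE_PLAN)
print("on GPU:", gpu_ops)
print("on CPU:", cpu_ops)
```

On a real session, the plan text for a pandas-on-Spark DataFrame can be printed with `psdf.spark.explain()`. Setting `spark.rapids.sql.explain=NOT_ON_GPU` should also make the plugin log which operators stayed on the CPU and why. Anything that pyspark.pandas hands off to Python workers will not appear as a `Gpu` operator.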