Support columnar processing for mapInArrow [databricks]
Closes https://github.com/NVIDIA/spark-rapids/issues/6313
This PR adds columnar support for the mapInArrow API, which was introduced in Spark 3.3.0.
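For readers unfamiliar with the API, here is a minimal PySpark sketch of what a mapInArrow call looks like. It is not taken from this PR: the function names `double_id` and `run_example`, the column name `id`, and the local session settings are all illustrative assumptions, and the snippet assumes `pyspark` (3.3.0 or later) and `pyarrow` are installed. It only shows the user-facing call; whether the plan actually runs on the GPU depends on how the plugin is configured.

```python
def double_id(batches):
    """Yield each incoming pyarrow.RecordBatch with its only column doubled.

    mapInArrow passes an iterator of pyarrow.RecordBatch objects and expects
    an iterator of pyarrow.RecordBatch objects back.
    """
    # Imported lazily so the function can be defined without pyarrow loaded.
    import pyarrow as pa
    import pyarrow.compute as pc

    for batch in batches:
        # Column 0 is the single "id" column in this hypothetical example.
        doubled = pc.multiply(batch.column(0), 2)
        yield pa.RecordBatch.from_arrays([doubled], names=["id"])


def run_example():
    """Run the sketch on a tiny local DataFrame (hypothetical setup)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("mapInArrow-sketch")
        .getOrCreate()
    )
    try:
        df = spark.range(5)  # one bigint column named "id"
        # The second argument is the output schema, given as a DDL string.
        result = df.mapInArrow(double_id, "id long")
        return [row.id for row in result.collect()]
    finally:
        spark.stop()
```

Calling `run_example()` starts a local Spark session, applies `double_id` to every Arrow batch, and returns the collected values.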
Performance
- Input: about 6.8 GB of Parquet data in local files.
- Hardware: 12 CPU cores and one GPU (Titan V, 12 GB memory).
| CPU Read + CPU mapInArrow | GPU Read + CPU mapInArrow | GPU Read + GPU mapInArrow |
|---|---|---|
| 97.20 | 91.36 | 81.67 |
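To put the three figures in relative terms, the short calculation below compares each configuration with the all-CPU run. It assumes the numbers are elapsed times where lower is better; the table does not state a unit, so treat the result as a reading aid rather than a measured claim. The names `timings`, `relative_gain`, and `summary` are illustrative.

```python
# Figures copied from the table above; the unit is not stated in the PR,
# so they are treated here as elapsed times (an assumption).
timings = {
    "CPU Read + CPU mapInArrow": 97.20,
    "GPU Read + CPU mapInArrow": 91.36,
    "GPU Read + GPU mapInArrow": 81.67,
}
baseline = timings["CPU Read + CPU mapInArrow"]


def relative_gain(value, base=baseline):
    """Return (speedup factor, percent reduction) relative to the baseline."""
    speedup = base / value
    reduction_pct = (base - value) / base * 100.0
    return round(speedup, 2), round(reduction_pct, 1)


summary = {name: relative_gain(value) for name, value in timings.items()}
for name, (speedup, reduction) in summary.items():
    print(f"{name}: {speedup:.2f}x, {reduction:.1f}% lower than baseline")
```

Under that assumption, moving only the read to the GPU is about 1.06x the baseline, and moving both the read and mapInArrow to the GPU is about 1.19x.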
Signed-off-by: Liangcai Li [email protected]
build
The latest failure is related to https://github.com/NVIDIA/spark-rapids/issues/6869
build