datafusion-comet
Do we have plans to support remote shuffle services, such as Apache Celeborn?
What is the problem the feature request solves?
In Spark, remote shuffle services manage shuffle data in a distributed environment to improve performance and stability. They store shuffle data outside the executors, which reduces executor memory usage, lowers garbage collection overhead, supports dynamic scaling, and improves fault tolerance for large-scale data processing.
Describe the potential solution
Support Apache Celeborn as a remote shuffle service.
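For context, a Spark job usually opts into Celeborn through a few Spark settings, with the shuffle manager class as the main integration point. The sketch below is based on my understanding of Celeborn's Spark client configuration. The master endpoint is a placeholder and `to_submit_args` is a hypothetical helper, not part of Spark, Celeborn, or Comet. Whether these settings can work together with Comet is the open question in this issue.

```python
# Sketch: Spark settings that route shuffle data through Apache Celeborn.
# Assumption: the endpoint below is a placeholder, not a real cluster.
CELEBORN_CONF = {
    # Replace Spark's default shuffle manager with Celeborn's client.
    "spark.shuffle.manager": "org.apache.spark.shuffle.celeborn.SparkShuffleManager",
    # Address of the Celeborn master (placeholder host and port).
    "spark.celeborn.master.endpoints": "celeborn-master.example.com:9097",
    # Spark's built-in external shuffle service is typically turned off.
    "spark.shuffle.service.enabled": "false",
}


def to_submit_args(conf):
    """Render a config mapping as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(CELEBORN_CONF)))
```

Because the integration hinges on `spark.shuffle.manager`, any Comet support would likely need to decide how its own shuffle path interacts with that setting.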
Additional context
No response
I think that it makes sense for Comet to support this, but it is not high on my list of priorities for now. We would welcome contributions in this area though.