
Do we have plans to support remote shuffle services, such as Apache Celeborn?

Open • dpengpeng opened this issue 9 months ago • 1 comment

What is the problem the feature request solves?

In Spark, remote shuffle services manage shuffle data outside the executors to improve performance and stability in distributed environments. Storing shuffle data externally reduces executor memory usage and garbage collection overhead, supports dynamic scaling, and improves fault tolerance for large-scale data processing.

Describe the potential solution

Apache Celeborn

Additional context

No response

dpengpeng • Mar 13 '25 03:03

I think that it makes sense for Comet to support this, but it is not high on my list of priorities for now. We would welcome contributions in this area though.

andygrove • Mar 13 '25 15:03