datafusion-ballista

Enable/configure shuffle compression

Open. Dandandan opened this issue 3 years ago • 1 comment

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently, arrow-ballista doesn't compress shuffle files. Compressing the files / data with lz4 or zstd would reduce IO (at the cost of some extra CPU).

Describe the solution you'd like

Enable compression for shuffle files/streams.

Dandandan, Dec 22 '22 08:12

We cannot control the file writer yet; this depends on https://github.com/apache/arrow-datafusion/issues/4708

Dandandan, Dec 22 '22 09:12