gazelle_plugin
gazelle_plugin copied to clipboard
[SHUFFLE] Solve shuffle's small files issue
Is your feature request related to a problem or challenge? Please describe what you are trying to do. currently during spill each reducer has a file, it's not scaling. The same problem as BypassMergeSortShuffleWriter
Describe the solution you'd like use the samilar way as UnsafeShuffleWriter
- preferspill = False
- Cache split batches in memory
- Spill all partitions only if there isn't enough memory or spill called by other operator (currently only largest reducer's batch is spilled)
- Spill all partitions in the same file, by partition id order (currently spill each reducer into one file)
- Merge the spills into single large file, by partition id order, use mmap instead of read/seek/write (currently use read/write) Output is the same
mmap shows worse performance than read/write.
spill | write | |
mmap | 2.23 | 4.3 |
read/write | 0.84 | 1.49 |
Looks like It's because the difference of page fault handling. write doesn't cause major page fault while ftruncate/fallocate + mmap caused major page fault