gazelle_plugin icon indicating copy to clipboard operation
gazelle_plugin copied to clipboard

[SHUFFLE] Solve shuffle's small files issue

Open FelixYBW opened this issue 2 years ago • 1 comments

Is your feature request related to a problem or challenge? Please describe what you are trying to do. currently during spill each reducer has a file, it's not scaling. The same problem as BypassMergeSortShuffleWriter

Describe the solution you'd like use the samilar way as UnsafeShuffleWriter

  1. preferspill = False
  2. Cache split batches in memory
  3. Spill all partitions only if there isn't enough memory or spill called by other operator (currently only largest reducer's batch is spilled)
  4. Spill all partitions in the same file, by partition id order (currently spill each reducer into one file)
  5. Merge the spills into single large file, by partition id order, use mmap instead of read/seek/write (currently use read/write) Output is the same

FelixYBW avatar Apr 12 '22 00:04 FelixYBW

mmap shows worse performance than read/write.

spill write
mmap 2.23 4.3
read/write 0.84 1.49

Looks like It's because the difference of page fault handling. write doesn't cause major page fault while ftruncate/fallocate + mmap caused major page fault

FelixYBW avatar May 30 '22 16:05 FelixYBW