
Round split work balancing (not just round-robin)

Open tammam1998 opened this issue 3 years ago • 2 comments

Right now r_split divides the file into blocks and assigns them to the output files in fixed round-robin order. For example, with two files and six blocks the result is file1: b0, b2, b4 and file2: b1, b3, b5. The block ordering doesn't change after that.
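To make the current behavior concrete, here is a minimal sketch of the fixed round-robin assignment. The function name `round_robin_assign` is made up for illustration; it is not r_split's actual code, and it only tracks block ids, not block contents.

```python
def round_robin_assign(num_blocks, num_files):
    """Assign block ids to output files in fixed round-robin order."""
    assignment = [[] for _ in range(num_files)]
    for block_id in range(num_blocks):
        # Block i always lands in file (i mod num_files), whatever its cost.
        assignment[block_id % num_files].append(block_id)
    return assignment


print(round_robin_assign(6, 2))  # [[0, 2, 4], [1, 3, 5]]
```

The assignment depends only on the block index, which is why a slow block in one file cannot be offset by giving that file fewer blocks.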

We can improve this by adding a modified eager/dgsh-tee that reads all the blocks output by r_split and distributes each one to the next available r_wrap or r_unwrap worker (using select/poll). For this to work well, we would need to insert an eager block reader at every level of the parallel tree. We would also need to change r_merge to accept blocks that aren't as regularly ordered as before (for example file1: b0, b3, b5, file2: b1, b2, b4).
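The idea can be simulated without any I/O. The sketch below is only a model under stated assumptions: the per-block processing times are invented, `eager_assign` and `merge_by_id` are hypothetical names, and a priority queue stands in for the select/poll loop that a real implementation would use. It hands each block to whichever worker frees up first, then shows that a merge keyed on block ids still restores the original order.

```python
import heapq


def eager_assign(block_costs, num_workers):
    """Give each block, in id order, to the worker that becomes free first."""
    # Heap entries are (time at which the worker becomes free, worker id).
    free_at = [(0, w) for w in range(num_workers)]
    heapq.heapify(free_at)
    assignment = [[] for _ in range(num_workers)]
    for block_id, cost in enumerate(block_costs):
        t, w = heapq.heappop(free_at)
        assignment[w].append(block_id)
        heapq.heappush(free_at, (t + cost, w))
    return assignment


def merge_by_id(streams):
    """Recombine per-worker streams into global order using block ids."""
    # Each stream is increasing in block id, so a k-way merge suffices.
    return list(heapq.merge(*streams))


costs = [3, 1, 3, 2, 2, 1]  # made-up processing times for b0..b5
streams = eager_assign(costs, 2)
print(streams)               # [[0, 3, 5], [1, 2, 4]]
print(merge_by_id(streams))  # [0, 1, 2, 3, 4, 5]
```

With these costs the simulation reproduces the irregular ordering from the example (file1: b0, b3, b5, file2: b1, b2, b4). The merge works because each worker's stream is still sorted by id, even though the interleaving is no longer predictable.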

This is experimental, but it could considerably improve workload balancing, especially for commands whose runtime varies with the content. Also, since block sizes change after every r_wrap, some blocks may change more than others and, as a result, become more time-consuming further down the pipeline. Eagerly reading blocks and distributing them would keep the work balanced and all nodes busy.

Once this is implemented, we can further improve work balancing by adding gaps between the blocks (blocks with size 0). A conservative gap size would be the number of r_wrap nodes between r_split and the merging point. We can then resplit the blocks in the eagers by utilizing the gaps. Another way of achieving this would be for the eager block reader to split big blocks as they arrive and modify the id numbers of the following blocks (this could be better, as it is more dynamic).
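A rough sketch of the second option (split on arrival and renumber) might look like the following. Everything here is an assumption made for illustration: blocks are modeled as `(id, bytes)` pairs, `split_and_renumber` and `max_size` are hypothetical names, and the split is by raw byte count, whereas a real splitter would most likely need to cut on line boundaries.

```python
def split_and_renumber(blocks, max_size):
    """Split blocks larger than max_size and renumber all ids consecutively."""
    out = []
    next_id = 0
    for _, data in blocks:
        # Cut the payload into pieces of at most max_size bytes.
        # An empty block still yields one (empty) piece so it keeps a slot.
        pieces = [data[i:i + max_size]
                  for i in range(0, len(data), max_size)] or [data]
        for piece in pieces:
            out.append((next_id, piece))
            next_id += 1
    return out


blocks = [(0, b"aaaa"), (1, b"b"), (2, b"cccccc")]
print(split_and_renumber(blocks, 3))
# [(0, b'aaa'), (1, b'a'), (2, b'b'), (3, b'ccc'), (4, b'ccc')]
```

Renumbering keeps the ids dense, so a merge keyed on ids needs no knowledge of which blocks were split. The gap-based alternative would instead reserve unused ids up front so that later blocks never need their ids shifted.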

tammam1998, Mar 25 '21 16:03

[Image: block_eager (1)]

The official name of these blocks is shufflers and they can be separated from eagers.

tammam1998, Mar 26 '21 19:03

@tammam1998 is this still relevant? If so, let's mark it as an enhancement!

angelhof, Jul 21 '22 17:07