modin icon indicating copy to clipboard operation
modin copied to clipboard

Reduce the amount of remote calls when reshuffling partitions

Open dchigarev opened this issue 1 year ago • 1 comments

Currently, the splitting stage of reshuffling generates up to NCORES ^ 2 amount of partitions. There are NCORES amount of row_partitions submitting the splitting kernels each additionally generating NCORES split partitions: https://github.com/modin-project/modin/blob/cdedd710f2ea092f72316600335b9e8e15995d28/modin/core/dataframe/pandas/partitioning/partition_manager.py#L1591-L1599

Operating with such amount of partitions causes Ray to perform badly (we had already hit it at #5394).

It's proposed to reduce the number of splitting kernels by combining row parts into column parts as follows:

num_existing_row_partitions = len(row_partitions)
# print(row_partitions.shape) -> # (X, 1) -> will result into a (X, X) split partitions
ideal_num_new_row_partitions = int(num_existing_row_partitions ** 0.5)

chunk_size = compute_chunksize(
    num_existing_row_partitions, ideal_num_new_row_partitions, min_block_size=1
)
new_row_partitions = np.array(
    [
        cls._column_partitions_class(row_partitions[i : i + chunk_size])
        for i in range(0, num_existing_row_partitions, chunk_size)
    ]
)
# print(new_row_partitions.shape) -> # (X ** 0.5, 1) -> will result into a (X ** 0.5, X) split partitions

Applying the changes gives the following results on my single-node machine:

CPU: Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz, 112 threads

case modin + #5780 modin + #5780 + draft impl for this tracker
repro from #5533 3.32s 1.74s
show-case from https://github.com/modin-project/modin/pull/4601#issuecomment-1331544461 with 100 cols 3.75s 4.44s
show-case from https://github.com/modin-project/modin/pull/4601#issuecomment-1331544461 with 32 cols 2.46s 1.84s

As you can see, the diffs vary from 2x speed up to 0.18x slow down, I'm still investigating a correlation to make a heuristic dispatching between these two implementations.

I also understand that performance in the cluster environment may react differently to this change as constructing these column partitions will cause data transferring between nodes which may be expensive. Maybe we can also dispatch between the old and the new logic depending on the IsRayCluster environment variable being set?

cc @modin-project/modin-core @RehanSD

dchigarev avatar Mar 20 '23 15:03 dchigarev

@dchigarev, could you revisit this issue?

YarShev avatar Jan 19 '24 16:01 YarShev