incubator-uniffle icon indicating copy to clipboard operation
incubator-uniffle copied to clipboard

[#731][FOLLOWUP] feat(Spark): Configure blockIdLayout for Spark based on max partitions

Open EnricoMi opened this issue 1 year ago • 1 comments

What changes were proposed in this pull request?

The configuration of the block id layout for Spark2 and Spark3 can be simplified by only providing the maximal number of partitions.

Why are the changes needed?

Currently, optimally configuring the block id layout for Spark is quite complex: https://github.com/apache/incubator-uniffle/pull/1528/files#diff-09ce7eaa98815d62ca00b2a8b0a45b0a922b047014c1f91dc17081b3fef8e7a8R106-R112

Three values have to be provided: the number of bits for sequence number, partition id and task attempt id. The task attempt id has to provide more bits than the partition id, in the sense that the maximal number of attempts can be stored. This requires to also account for speculative execution.

The RssShuffleManager can do the computation and derive the optimal block id layout configuration from the maximal number of partitions only.

Does this PR introduce any user-facing change?

Adds optional configuration spark.rss.blockId.maxPartitions.

How was this patch tested?

Unit and integration tests.

EnricoMi avatar Mar 07 '24 12:03 EnricoMi

Test Results

 2 326 files  ±  0   2 326 suites  ±0   4h 29m 14s :stopwatch: -12s    902 tests + 51     900 :white_check_mark: + 51   2 :zzz: ±0  0 :x: ±0  10 477 runs  +459  10 454 :white_check_mark: +459  23 :zzz: ±0  0 :x: ±0 

Results for commit 1aa1172d. ± Comparison against base commit d994b272.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Mar 07 '24 13:03 github-actions[bot]

I will review in this weekend.

jerqi avatar Mar 08 '24 06:03 jerqi

Codecov Report

Attention: Patch coverage is 89.32039% with 11 lines in your changes are missing coverage. Please review.

Project coverage is 54.83%. Comparing base (d994b27) to head (1aa1172).

Files Patch % Lines
...pache/uniffle/server/ShuffleServerGrpcService.java 0.00% 10 Missing :warning:
...uniffle/shuffle/manager/RssShuffleManagerBase.java 98.87% 0 Missing and 1 partial :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1566      +/-   ##
============================================
+ Coverage     53.72%   54.83%   +1.11%     
- Complexity     2838     2855      +17     
============================================
  Files           437      417      -20     
  Lines         24663    22387    -2276     
  Branches       2094     2103       +9     
============================================
- Hits          13250    12276     -974     
+ Misses        10581     9349    -1232     
+ Partials        832      762      -70     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Mar 08 '24 06:03 codecov-commenter