incubator-uniffle icon indicating copy to clipboard operation
incubator-uniffle copied to clipboard

[#1303] refactor(client): using the configOption in RssTezConfig

Open lifeSo opened this issue 1 year ago • 11 comments

What changes were proposed in this pull request?

Improvement: using the configOption in RssTezConfig #1303

Why are the changes needed?

Fix: https://github.com/apache/incubator-uniffle/issues/1303

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test

lifeSo avatar Jan 01 '24 17:01 lifeSo

Codecov Report

Attention: 79 lines in your changes are missing coverage. Please review.

Comparison is base (15a399a) 53.43% compared to head (79a08b2) 54.88%. Report is 4 commits behind head on master.

Files Patch % Lines
...main/java/org/apache/tez/common/TezClientConf.java 95.57% 11 Missing and 2 partials :warning:
...java/org/apache/uniffle/common/config/RssConf.java 0.00% 13 Missing :warning:
...c/main/java/org/apache/tez/common/RssTezUtils.java 78.57% 5 Missing and 4 partials :warning:
...n/java/org/apache/tez/dag/app/RssDAGAppMaster.java 77.77% 7 Missing and 1 partial :warning:
...library/common/shuffle/impl/RssTezFetcherTask.java 0.00% 8 Missing :warning:
...tez/runtime/library/input/RssUnorderedKVInput.java 0.00% 8 Missing :warning:
.../org/apache/uniffle/common/config/RssBaseConf.java 0.00% 5 Missing :warning:
...rg/apache/tez/dag/app/TezRemoteShuffleManager.java 60.00% 3 Missing and 1 partial :warning:
...rary/common/shuffle/orderedgrouped/RssShuffle.java 50.00% 4 Missing :warning:
...on/shuffle/orderedgrouped/RssShuffleScheduler.java 72.72% 3 Missing :warning:
... and 2 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1413      +/-   ##
============================================
+ Coverage     53.43%   54.88%   +1.45%     
- Complexity     2723     2737      +14     
============================================
  Files           419      402      -17     
  Lines         23992    22015    -1977     
  Branches       2046     2063      +17     
============================================
- Hits          12820    12083     -737     
+ Misses        10378     9201    -1177     
+ Partials        794      731      -63     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jan 01 '24 18:01 codecov-commenter

I like this PR that refactor the client config. Could you help fix the failed test cases

zuston avatar Jan 02 '24 10:01 zuston

Could you help review this? @zhengchenyu

zuston avatar Jan 02 '24 10:01 zuston

I like this PR that refactor the client config. Could you help fix the failed test cases

Ok, I will fix the failed test cases recently.

lifeSo avatar Jan 06 '24 05:01 lifeSo

ping @zhengchenyu

jerqi avatar Jan 08 '24 07:01 jerqi

It is a great PR! The code about config will be cleaner! But I've always had a question. There are a lot of transitions from Configuration to RssConf, then from RssConf to Configuration. It seems strange. In this PR, I think @lifeSo used cache Configuration in TezClientConf, it is more efficient, but additional memory is required. @zuston What do you think of this approach?

zhengchenyu avatar Jan 09 '24 02:01 zhengchenyu

In this PR, I think @lifeSo used cache Configuration in TezClientConf, it is more efficient, but additional memory is required.

LGTM. It looks the memory will not consume much more. If I'm wrong, feel free to correct.

zuston avatar Jan 09 '24 03:01 zuston

I summarized the usage of TezClientConf, and why need the transitions:

  • GetShuffleServerRequest & GetShuffleServerResponse & IdUtils & InputContextUtils no nedd conf

  • RssDAGAppMaster need TezClientConf to get arguement and transitions.

  • TezRemoteShuffleManager need TezClientConf and configuration ,to get arguement and transitions.

  • RssTezUtils need RssTezUtils

  • ShuffleAssignmentsInfoWritable no need conf

  • TezClassLoader no need conf

  • UmbilicalUtils need configuration to pass arguement

  • RssTezAMPolicyProvider no conf

  • RemoteFetchedInput no conf

  • RssShuffleManager no need TezClientConf to get conf but transitions is needed.

  • RssSimpleFetchedInputAllocator need configuration and TezClientConf to get some conf

  • RssTezFetcher no need configuration and TezClientConf

  • RssTezFetcherTask need TezClientConf to get argue,also need Configuration to get transitions.

  • RssInMemoryMerger & RssMergeManager & RssShuffle & RssShuffleScheduler need configuration to call other Tez class, no need TezClientConf

  • RssTezBypassWriter & RssTezShuffleDataFetcher & WriteBuffer

  • WriteBufferManager no need , just pass RssConf.

  • RssSorter & RssUnSorter need Configuration and use only in super()

  • RssTezPerPartitionRecord no conf

  • RssOrderedPartitionedKVOutput & RssUnorderedKVOutput & RssUnorderedPartitionedKVOutput no nedd TezClientConf.

lifeSo avatar Jan 09 '24 19:01 lifeSo

cc @zhengchenyu I'm not familiar with Tez code.

zuston avatar Jan 12 '24 02:01 zuston

cc @zhengchenyu I'm not familiar with Tez code.

This PR does not involve the core logic of Tez. This PR is only about how the MR/Tez/Spark client uses configuration. I'm still a little confused about how to design the client config.

zhengchenyu avatar Jan 12 '24 02:01 zhengchenyu

Emm... I take a look about current impl, I don't think we should take the tez prefix into the config key, you can refer the spark's RssClientConf .

In spark's partial configOption, we just extract the spark's conf like spark.rss.client.type into RssConf, that means we could share some general configOption defined in the RssClientConf. So you just only create the TezClientConfig.java, which could define some tez specified configOptions, shareable configs could use defined in RssClientConf. cc @zhengchenyu

@zuston I agree with you! If so, RssClientConf should be extended from RssClientConf (But for now, RssClientConf is extended from RssBaseConf ), and does not take the tez prefix into the config key. I think the tez prefix should be considered in the TezConfiguration. @lifeSo what about you?

zhengchenyu avatar Jan 12 '24 06:01 zhengchenyu