incubator-uniffle icon indicating copy to clipboard operation
incubator-uniffle copied to clipboard

[#1881] fix(client-spark): Make spark client uniifle class under org.apache.uniffle package to avoid conflict

Open maobaolong opened this issue 1 year ago • 7 comments

What changes were proposed in this pull request?

Make spark client uniifle class under org.apache.uniffle package to avoid conflict

Why are the changes needed?

(Please clarify why the changes are needed. For instance,

  1. If you propose a new API, clarify the use case for a new API.
  2. If you fix a bug, describe the bug.)

Fix: #1881

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Tests on local
  • Tests on our test cluster

maobaolong avatar Jul 10 '24 02:07 maobaolong

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 53.16%. Comparing base (704c70c) to head (0ee9803). Report is 5 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1882      +/-   ##
============================================
- Coverage     53.35%   53.16%   -0.20%     
- Complexity     2846     2991     +145     
============================================
  Files           428      447      +19     
  Lines         22805    24435    +1630     
  Branches       2142     2273     +131     
============================================
+ Hits          12167    12990     +823     
- Misses         9861    10646     +785     
- Partials        777      799      +22     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jul 10 '24 02:07 codecov-commenter

Test Results

 2 657 files  ±0   2 657 suites  ±0   5h 31m 12s :stopwatch: +4s    946 tests ±0     945 :white_check_mark: ±0   1 :zzz: ±0  0 :x: ±0  11 799 runs  ±0  11 784 :white_check_mark: ±0  15 :zzz: ±0  0 :x: ±0 

Results for commit 28fe7d7e. ± Comparison against base commit c48af78e.

This pull request removes 63 and adds 63 tests. Note that renamed tests count towards both.
org.apache.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateFallback
org.apache.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateInDriver
org.apache.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateInDriverDenied
org.apache.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateInExecutor
org.apache.spark.shuffle.DelegationRssShuffleManagerTest ‑ testTryAccessCluster
org.apache.spark.shuffle.FunctionUtilsTests ‑ testOnceFunction0
org.apache.spark.shuffle.RssShuffleManagerTest ‑ testCreateShuffleManagerServer
org.apache.spark.shuffle.RssShuffleManagerTest ‑ testGetDataDistributionType
org.apache.spark.shuffle.RssShuffleManagerTest ‑ testRssShuffleManagerInterface
org.apache.spark.shuffle.RssShuffleManagerTest ‑ testRssShuffleManagerRegisterShuffle{int}[1]
…
org.apache.uniffle.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateFallback
org.apache.uniffle.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateInDriver
org.apache.uniffle.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateInDriverDenied
org.apache.uniffle.spark.shuffle.DelegationRssShuffleManagerTest ‑ testCreateInExecutor
org.apache.uniffle.spark.shuffle.DelegationRssShuffleManagerTest ‑ testTryAccessCluster
org.apache.uniffle.spark.shuffle.FunctionUtilsTests ‑ testOnceFunction0
org.apache.uniffle.spark.shuffle.RssShuffleManagerTest ‑ testCreateShuffleManagerServer
org.apache.uniffle.spark.shuffle.RssShuffleManagerTest ‑ testGetDataDistributionType
org.apache.uniffle.spark.shuffle.RssShuffleManagerTest ‑ testRssShuffleManagerInterface
org.apache.uniffle.spark.shuffle.RssShuffleManagerTest ‑ testRssShuffleManagerRegisterShuffle{int}[1]
…

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Jul 10 '24 02:07 github-actions[bot]

Why do we put them in one package? Because we want to use default package method easily.

jerqi avatar Jul 10 '24 05:07 jerqi

image

@jerqi Thanks for your explain, but in the other side, we could meet MethodNotDefException future in the following scenario:

  • If spark create a class have the same package and class name with the above screenshot, we could use a conflict class.
  • If some other dependencies have the same class and package name, we could use a conflict class too.

So, I have to suggest to make sure all of our uniffle client classes are under org.apache.uniffle package or relocated by shaded maven plugins, if we do like this, we can reduce trouble of conflict class.

maobaolong avatar Jul 10 '24 11:07 maobaolong

This is a breaking change, too. If you need to extend and use some default privilege methods, it will be inconvenient. Conflicts won't be a big issue. I prefer not shading this.

jerqi avatar Jul 10 '24 11:07 jerqi

@jerqi I see. So how about another resolution, we do not add a uniffle pacakge name, instead, we add a class prefix with Uniffle. e.g. SparkVersionUtils -> UnifleSparkVersionUtils ?

BTW, I'm open to keep it like it is now, we can merge this PR after we first encounter a class conflict issue involved by this.

maobaolong avatar Jul 10 '24 12:07 maobaolong

@jerqi I study celeborn, css(byteDance), shuttle(oppo) , just paste some thoughts from my point. All of them put the client class entry point and related classes under a specific package, they are celeborn/css/shuttle, but for uniffle, we put it into shuffle package and named like RssXXX, which is too generic.

  • celeborn image
  • css https://github.com/bytedance/CloudShuffleService/blob/main/spark-shuffle-manager-3/src/main/scala/org/apache/spark/shuffle/css/CssShuffleManager.scala
  • shuttle https://github.com/cubefs/shuttle/blob/master/src/main/scala/org/apache/spark/shuffle/Ors2ShuffleManager.scala

maobaolong avatar Jul 31 '24 01:07 maobaolong