
Write to HDFS when the local disk can't be written

Open xianjingfeng opened this issue 2 years ago • 3 comments

What changes were proposed in this pull request?

Write to HDFS when the local disk can't be written to.

Why are the changes needed?

There should be a fallback mechanism for when the local disk can't be written to. #163
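
For readers unfamiliar with the idea, here is a minimal sketch of such a fallback, written under assumptions: the names `FallbackSketch`, `Storage`, `SimpleStorage`, and `selectStorage` are hypothetical and are not taken from the Uniffle code base, and the real strategy classes in this PR may work differently. It only illustrates the general pattern of trying the preferred storage first and falling back to another one when it cannot be written.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Uniffle.
public class FallbackSketch {

    /** Minimal stand-in for a storage backend (assumed interface). */
    interface Storage {
        String name();
        boolean canWrite();
    }

    /** Trivial storage whose writability is fixed at construction time. */
    static final class SimpleStorage implements Storage {
        private final String name;
        private final boolean writable;

        SimpleStorage(String name, boolean writable) {
            this.name = name;
            this.writable = writable;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public boolean canWrite() {
            return writable;
        }
    }

    /**
     * Returns the preferred storage if it is writable; otherwise returns the
     * first writable storage from the fallback list, in order.
     */
    static Storage selectStorage(Storage preferred, List<Storage> fallbacks) {
        if (preferred.canWrite()) {
            return preferred;
        }
        for (Storage candidate : fallbacks) {
            if (candidate.canWrite()) {
                return candidate;
            }
        }
        throw new IllegalStateException("No writable storage available");
    }

    public static void main(String[] args) {
        Storage local = new SimpleStorage("localfile", false); // simulated unwritable disk
        Storage hdfs = new SimpleStorage("hdfs", true);
        Storage chosen = selectStorage(local, Arrays.asList(hdfs));
        System.out.println("Selected storage: " + chosen.name());
    }
}
```

In this sketch the fallback order is simply the order of the list, which keeps the selection logic separate from how each backend decides whether it is writable.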

Does this PR introduce any user-facing change?

No

How was this patch tested?

Already added

xianjingfeng avatar Sep 21 '22 10:09 xianjingfeng

Codecov Report

Merging #235 (5d5767b) into master (47effb2) will decrease coverage by 0.71%. The diff coverage is 69.44%.

@@             Coverage Diff              @@
##             master     #235      +/-   ##
============================================
- Coverage     59.71%   58.99%   -0.72%     
+ Complexity     1377     1336      -41     
============================================
  Files           166      166              
  Lines          8918     8570     -348     
  Branches        853      840      -13     
============================================
- Hits           5325     5056     -269     
+ Misses         3318     3233      -85     
- Partials        275      281       +6     
Impacted Files Coverage Δ
...he/uniffle/server/storage/MultiStorageManager.java 49.23% <48.83%> (+11.73%) :arrow_up:
...er/storage/HdfsStorageManagerFallbackStrategy.java 71.42% <71.42%> (ø)
...r/storage/LocalStorageManagerFallbackStrategy.java 71.42% <71.42%> (ø)
...e/uniffle/server/storage/SingleStorageManager.java 67.64% <71.42%> (+0.43%) :arrow_up:
...torage/AbstractStorageManagerFallbackStrategy.java 75.00% <75.00%> (ø)
...org/apache/uniffle/server/ShuffleFlushManager.java 78.80% <100.00%> (+0.11%) :arrow_up:
...a/org/apache/uniffle/server/ShuffleServerConf.java 99.21% <100.00%> (+0.03%) :arrow_up:
.../storage/RotateStorageManagerFallbackStrategy.java 100.00% <100.00%> (ø)
...ava/org/apache/uniffle/common/RssShuffleUtils.java 0.00% <0.00%> (-95.66%) :arrow_down:
.../java/org/apache/hadoop/mapreduce/RssMRConfig.java 23.07% <0.00%> (-51.93%) :arrow_down:
... and 28 more


codecov-commenter avatar Sep 21 '22 10:09 codecov-commenter

There are some flaky unit tests:

java.lang.ClassCastException: org.apache.spark.shuffle.RssShuffleManager cannot be cast to org.apache.uniffle.test.GetShuffleReportForMultiPartTest$RssShuffleManagerWrapper
    at org.apache.uniffle.test.GetShuffleReportForMultiPartTest.runTest(GetShuffleReportForMultiPartTest.java:180)
    at org.apache.uniffle.test.SparkIntegrationTestBase.runSparkApp(SparkIntegrationTestBase.java:74)
    at org.apache.uniffle.test.SparkIntegrationTestBase.run(SparkIntegrationTestBase.java:52)
    at org.apache.uniffle.test.GetShuffleReportForMultiPartTest.resultCompareTest(GetShuffleReportForMultiPartTest.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Error: Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.256 s <<< FAILURE! - in org.apache.uniffle.coordinator.LowestIOSampleCostSelectStorageStrategyTest
Error: selectStorageTest Time elapsed: 6.083 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <hdfs://p2> but was: <hdfs://p1>
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
    at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
    at org.apache.uniffle.coordinator.LowestIOSampleCostSelectStorageStrategyTest.selectStorageTest(LowestIOSampleCostSelectStorageStrategyTest.java:133)

xianjingfeng avatar Sep 22 '22 08:09 xianjingfeng

I think this PR is a good improvement! We also need it to avoid the problem of a full local disk, although we would prefer not to enable writing big blocks directly to HDFS.

zuston avatar Sep 26 '22 02:09 zuston

Do you have time to invest in this PR? I hope it can be introduced into our company's internal version, and I look forward to it being merged ASAP. @xianjingfeng

zuston avatar Oct 10 '22 09:10 zuston

Wait for CI

jerqi avatar Oct 27 '22 12:10 jerqi

@zuston Gentle ping.

jerqi avatar Oct 28 '22 07:10 jerqi

Merged. @xianjingfeng Thanks for your contribution

zuston avatar Oct 28 '22 07:10 zuston