incubator-uniffle
Write to HDFS when the local disk can't be written to
What changes were proposed in this pull request?
Write to HDFS when the local disk can't be written to.
Why are the changes needed?
There should be a fallback mechanism when the disk can't be written to. #163
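A minimal sketch of the fallback idea, not the code in this PR: every name below (`FallbackSketch`, `Storage`, `select`) is hypothetical and only illustrates "try local first, fall back to HDFS when local cannot be written".

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FallbackSketch {
    /** Hypothetical stand-in for a storage backend; not a Uniffle type. */
    enum Storage { LOCAL, HDFS }

    /** Returns the first storage in preference order that can currently be written. */
    static Storage select(List<Storage> preference, Predicate<Storage> canWrite) {
        for (Storage candidate : preference) {
            if (canWrite.test(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No writable storage available");
    }

    public static void main(String[] args) {
        List<Storage> preference = Arrays.asList(Storage.LOCAL, Storage.HDFS);
        // Simulate a local disk that is full or corrupted (assumption for the demo).
        Storage chosen = select(preference, s -> s != Storage.LOCAL);
        System.out.println(chosen); // prints "HDFS"
    }
}
```

The preference list keeps the normal path unchanged: when the local disk is writable it is still chosen first, and HDFS is used only as the fallback.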
Does this PR introduce any user-facing change?
No
How was this patch tested?
Already added
Codecov Report
Merging #235 (5d5767b) into master (47effb2) will decrease coverage by 0.71%. The diff coverage is 69.44%.
@@ Coverage Diff @@
## master #235 +/- ##
============================================
- Coverage 59.71% 58.99% -0.72%
+ Complexity 1377 1336 -41
============================================
Files 166 166
Lines 8918 8570 -348
Branches 853 840 -13
============================================
- Hits 5325 5056 -269
+ Misses 3318 3233 -85
- Partials 275 281 +6
Impacted Files | Coverage Δ | |
---|---|---|
...he/uniffle/server/storage/MultiStorageManager.java | 49.23% <48.83%> (+11.73%) | :arrow_up: |
...er/storage/HdfsStorageManagerFallbackStrategy.java | 71.42% <71.42%> (ø) | |
...r/storage/LocalStorageManagerFallbackStrategy.java | 71.42% <71.42%> (ø) | |
...e/uniffle/server/storage/SingleStorageManager.java | 67.64% <71.42%> (+0.43%) | :arrow_up: |
...torage/AbstractStorageManagerFallbackStrategy.java | 75.00% <75.00%> (ø) | |
...org/apache/uniffle/server/ShuffleFlushManager.java | 78.80% <100.00%> (+0.11%) | :arrow_up: |
...a/org/apache/uniffle/server/ShuffleServerConf.java | 99.21% <100.00%> (+0.03%) | :arrow_up: |
.../storage/RotateStorageManagerFallbackStrategy.java | 100.00% <100.00%> (ø) | |
...ava/org/apache/uniffle/common/RssShuffleUtils.java | 0.00% <0.00%> (-95.66%) | :arrow_down: |
.../java/org/apache/hadoop/mapreduce/RssMRConfig.java | 23.07% <0.00%> (-51.93%) | :arrow_down: |
... and 28 more | | |
There are some flaky unit tests:
java.lang.ClassCastException: org.apache.spark.shuffle.RssShuffleManager cannot be cast to org.apache.uniffle.test.GetShuffleReportForMultiPartTest$RssShuffleManagerWrapper
    at org.apache.uniffle.test.GetShuffleReportForMultiPartTest.runTest(GetShuffleReportForMultiPartTest.java:180)
    at org.apache.uniffle.test.SparkIntegrationTestBase.runSparkApp(SparkIntegrationTestBase.java:74)
    at org.apache.uniffle.test.SparkIntegrationTestBase.run(SparkIntegrationTestBase.java:52)
    at org.apache.uniffle.test.GetShuffleReportForMultiPartTest.resultCompareTest(GetShuffleReportForMultiPartTest.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Error: Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.256 s <<< FAILURE! - in org.apache.uniffle.coordinator.LowestIOSampleCostSelectStorageStrategyTest
Error: selectStorageTest Time elapsed: 6.083 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <hdfs://p2> but was: <hdfs://p1>
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
    at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
    at org.apache.uniffle.coordinator.LowestIOSampleCostSelectStorageStrategyTest.selectStorageTest(LowestIOSampleCostSelectStorageStrategyTest.java:133)
I think this PR is a good improvement! We also need it to avoid the full-local-disk problem, although we would prefer not to enable writing big blocks directly to HDFS.
Do you have time to work on this PR? I hope it can be introduced into our company's internal version, and I look forward to it being merged ASAP. @xianjingfeng
Wait for CI
@zuston Gentle ping.
Merged. @xianjingfeng Thanks for your contribution