[CELEBORN-914] Support memory file storage

Open FMX opened this issue 1 year ago • 2 comments

What changes were proposed in this pull request?

To support memory file storage.

Why are the changes needed?

To improve shuffle performance for small shuffle files.

Design doc: https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit?usp=sharing

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Passed GA and tested manually on a cluster. Here is a brief performance test result for TPC-DS 10G on a 4-node Celeborn cluster (20 vCPUs, 88 GB, 8x HDD):

Without memory file storage: 983s

With memory file storage: 791s

That is about a 20% reduction in total runtime (192s saved out of 983s).
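For reference, the two runtimes reported above can be turned into a relative improvement in more than one way. The following is a minimal sketch, not part of the PR; the function names `runtime_reduction` and `speedup` are hypothetical helpers introduced only for illustration.

```python
def runtime_reduction(baseline_s: float, candidate_s: float) -> float:
    """Fraction of the baseline runtime saved by the candidate run."""
    if baseline_s <= 0:
        raise ValueError("baseline runtime must be positive")
    return (baseline_s - candidate_s) / baseline_s


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate run is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate runtime must be positive")
    return baseline_s / candidate_s


# Runtimes taken from the test result in the PR description (seconds).
without_memory_s = 983.0
with_memory_s = 791.0

reduction = runtime_reduction(without_memory_s, with_memory_s)
factor = speedup(without_memory_s, with_memory_s)

print(f"runtime reduction: {reduction:.1%}")  # runtime reduction: 19.5%
print(f"speedup factor: {factor:.2f}x")       # speedup factor: 1.24x
```

Stating which of the two measures is meant (time saved versus speedup factor) avoids ambiguity when comparing benchmark runs.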

FMX, Feb 19 '24 01:02

Codecov Report

Attention: Patch coverage is 73.15436%, with 40 lines in your changes missing coverage. Please review.

Project coverage is 49.33%. Comparing base (21d5698) to head (52cd7d8). Report is 8 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ...java/org/apache/celeborn/common/meta/FileInfo.java | 62.86% | 10 Missing and 3 partials :warning: |
| ...rg/apache/celeborn/common/meta/MemoryFileInfo.java | 52.95% | 8 Missing :warning: |
| ...rg/apache/celeborn/common/meta/ReduceFileMeta.java | 70.00% | 6 Missing :warning: |
| ...he/celeborn/common/util/ShuffleBlockInfoUtils.java | 83.34% | 0 Missing and 6 partials :warning: |
| ...g/apache/celeborn/common/protocol/StorageInfo.java | 50.00% | 2 Missing :warning: |
| ...cala/org/apache/celeborn/common/CelebornConf.scala | 88.24% | 1 Missing and 1 partial :warning: |
| ...org/apache/celeborn/common/util/PbSerDeUtils.scala | 0.00% | 2 Missing :warning: |
| ...e/celeborn/common/network/buffer/ChunkBuffers.java | 75.00% | 0 Missing and 1 partial :warning: |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2300      +/-   ##
==========================================
+ Coverage   48.96%   49.33%   +0.38%     
==========================================
  Files         209      211       +2     
  Lines       13102    13238     +136     
  Branches     1134     1149      +15     
==========================================
+ Hits         6414     6530     +116     
- Misses       6270     6276       +6     
- Partials      418      432      +14     

:umbrella: View full report in Codecov by Sentry.

codecov[bot], Feb 19 '24 10:02

Codecov Report

Attention: Patch coverage is 9.49367%, with 143 lines in your changes missing coverage. Please review.

Project coverage is 38.98%. Comparing base (121395f) to head (53555a1). Report is 51 commits behind head on main.

:exclamation: Current head 53555a1 differs from pull request most recent head 137ab52

Please upload reports for the commit 137ab52 to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| ...he/celeborn/common/util/ShuffleBlockInfoUtils.java | 0.00% | 36 Missing :warning: |
| ...java/org/apache/celeborn/common/meta/FileInfo.java | 0.00% | 35 Missing :warning: |
| ...rg/apache/celeborn/common/meta/MemoryFileInfo.java | 0.00% | 24 Missing :warning: |
| ...rg/apache/celeborn/common/meta/ReduceFileMeta.java | 0.00% | 20 Missing :warning: |
| ...leborn/common/network/buffer/FileChunkBuffers.java | 0.00% | 6 Missing :warning: |
| ...born/common/network/buffer/MemoryChunkBuffers.java | 0.00% | 6 Missing :warning: |
| ...cala/org/apache/celeborn/common/CelebornConf.scala | 68.43% | 6 Missing :warning: |
| ...e/celeborn/common/network/buffer/ChunkBuffers.java | 0.00% | 4 Missing :warning: |
| .../org/apache/celeborn/common/meta/DiskFileInfo.java | 0.00% | 2 Missing :warning: |
| ...g/apache/celeborn/common/protocol/StorageInfo.java | 50.00% | 2 Missing :warning: |

... and 1 more

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2300      +/-   ##
==========================================
- Coverage   40.17%   38.98%   -1.19%     
==========================================
  Files         218      219       +1     
  Lines       13742    13547     -195     
  Branches     1214     1191      -23     
==========================================
- Hits         5520     5280     -240     
- Misses       7905     7966      +61     
+ Partials      317      301      -16     


codecov-commenter, Apr 07 '24 01:04

@FMX the metrics added by this PR have not been added to the Monitoring page of the Celeborn website. Also, should we start including such changes in the release notes as well? WDYT?

s0nskar, Aug 14 '24 14:08