incubator-uniffle

[Improvement] Optimize local disk selection strategy

Open zuston opened this issue 3 years ago • 6 comments

Code of Conduct

Search before asking

  • [X] I have searched in the issues and found no similar issues.

What would you like to be improved?

I am raising this issue to improve stability when using the MEMORY_LOCALFILE storage type. Some related issues may become sub-tasks of this improvement.

The first improvement is to avoid failing all apps when a single disk's usage reaches the high watermark. We could make the following optimizations.

  1. Introduce metrics for the top 10 apps ranked by number of bytes written (#333).
  2. Introduce free-space and total-space metrics for every local disk.
  3. Introduce a pluggable disk selection strategy. Currently the disk is selected based on a hash. A free-capacity-based strategy should be supported.
  4. Allow an app to write data to another disk when its assigned disk reaches the high watermark (#306).
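
To make items 3 and 4 concrete, here is a minimal sketch of what a pluggable strategy could look like. Every name in it (`DiskInfo`, `DiskSelectionStrategy`, the two implementations, the `highWatermark` parameter) is hypothetical and chosen for illustration; none of it is taken from the existing Uniffle code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical view of one local disk; not Uniffle's actual storage class. */
final class DiskInfo {
    final String path;
    final long totalBytes;
    final long usedBytes;

    DiskInfo(String path, long totalBytes, long usedBytes) {
        this.path = path;
        this.totalBytes = totalBytes;
        this.usedBytes = usedBytes;
    }

    long freeBytes() {
        return totalBytes - usedBytes;
    }

    /** True when the used fraction is at or above the watermark (e.g. 0.95). */
    boolean aboveHighWatermark(double highWatermark) {
        return totalBytes <= 0 || (double) usedBytes / totalBytes >= highWatermark;
    }
}

/** Hypothetical plug-in point: pick a disk for a key such as "appId/shuffleId/partitionId". */
interface DiskSelectionStrategy {
    Optional<DiskInfo> select(List<DiskInfo> disks, String key, double highWatermark);
}

/** Hash-based choice that moves on to the next disk when the hashed one is too full. */
final class HashSelectionStrategy implements DiskSelectionStrategy {
    @Override
    public Optional<DiskInfo> select(List<DiskInfo> disks, String key, double highWatermark) {
        if (disks.isEmpty()) {
            return Optional.empty();
        }
        int start = Math.floorMod(key.hashCode(), disks.size());
        for (int i = 0; i < disks.size(); i++) {
            DiskInfo candidate = disks.get((start + i) % disks.size());
            if (!candidate.aboveHighWatermark(highWatermark)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // every disk is above the watermark
    }
}

/** Free-capacity-based choice: the writable disk with the most free bytes. */
final class FreeCapacitySelectionStrategy implements DiskSelectionStrategy {
    @Override
    public Optional<DiskInfo> select(List<DiskInfo> disks, String key, double highWatermark) {
        return disks.stream()
            .filter(d -> !d.aboveHighWatermark(highWatermark))
            .max(Comparator.comparingLong(DiskInfo::freeBytes));
    }
}
```

One consequence of either fallback or capacity-based selection is that the key no longer maps to a disk deterministically, so the server would presumably have to remember which disk was chosen for each key.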

How should we improve?

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

zuston • Nov 29 '22 07:11

PTAL @jerqi @xianjingfeng @leixm @smallzhongfeng @kaijchen

zuston • Nov 29 '22 07:11

  1. We chose the hash selection strategy because we want to reduce the amount of metadata that we need to maintain in memory.

jerqi • Nov 29 '22 10:11

  1. Can we use Consistent Hashing?

xianjingfeng • Nov 30 '22 04:11
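
For reference, the consistent hashing idea raised above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `ConsistentHashDiskRing`, the use of CRC32 as the hash function, and the virtual-node count are all made up here, and the thread does not say what key would be hashed or when a disk would leave the ring.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical consistent-hash ring over local disk paths. */
final class ConsistentHashDiskRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashDiskRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    /** CRC32 is used only because it is deterministic and in the JDK. */
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Places several virtual nodes per disk on the ring to smooth the distribution. */
    void addDisk(String diskPath) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(diskPath + "#" + i), diskPath);
        }
    }

    /** Removes a disk, e.g. once it has reached the high watermark. */
    void removeDisk(String diskPath) {
        ring.values().removeIf(diskPath::equals);
    }

    /** Returns the first disk clockwise from the key's hash, or null if the ring is empty. */
    String select(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

With this layout, removing one disk only remaps the keys that pointed at it; keys on the other disks keep their placement without any per-key metadata. It does not by itself balance disks by free capacity.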

Introduce the pluggable disk selection strategy. Currently the disk will be selected based on the hash. Free-capacity based strategy should be supported.

Agreed. Currently the hash-based strategy may cause unbalanced I/O across disks, because apps' shuffle patterns may vary dramatically. A strategy based on capacity and disk stats would be very nice to have.

advancedxy • Dec 09 '22 08:12

Introduce the free space & total space metrics of every local disk

@zuston, how do you plan to collect these metrics? By using df, or in some other fancy way?

advancedxy • Dec 13 '22 09:12
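
The thread does not record how these metrics ended up being collected. As one possible option that avoids shelling out to df, the JDK's `java.nio.file.FileStore` exposes total and usable space directly. The sketch below assumes a hypothetical `DiskSpaceSnapshot` holder; how the values would be registered as server metrics is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical point-in-time reading of one local disk's space. */
final class DiskSpaceSnapshot {
    final String path;
    final long totalBytes;
    final long usableBytes;

    DiskSpaceSnapshot(String path, long totalBytes, long usableBytes) {
        this.path = path;
        this.totalBytes = totalBytes;
        this.usableBytes = usableBytes;
    }

    /** Reads the file store that backs the given directory. */
    static DiskSpaceSnapshot of(String dir) {
        try {
            FileStore store = Files.getFileStore(Paths.get(dir));
            return new DiskSpaceSnapshot(dir, store.getTotalSpace(), store.getUsableSpace());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fraction of the disk in use; an unreadable or zero-sized store counts as full. */
    double usedRatio() {
        if (totalBytes <= 0) {
            return 1.0;
        }
        return 1.0 - (double) usableBytes / totalBytes;
    }
}
```

A periodic task could call `DiskSpaceSnapshot.of(dir)` for each configured storage directory and publish `totalBytes`, `usableBytes`, and `usedRatio()` as gauges.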

Interesting feature

maobaolong • Nov 16 '24 14:11