[Improvement] Optimize local disk selection strategy
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
What would you like to be improved?
I am raising this issue to improve stability when using the MEMORY_LOCALFILE storage type. Some related issues may be tracked as sub-tasks of this improvement.
The first improvement is to avoid all apps failing when a single disk reaches its high-watermark. We could make the following optimizations.
- Introduce metrics for the top 10 apps ranked by number of written bytes #333.
- Introduce free-space and total-space metrics for every local disk.
- Introduce a pluggable disk selection strategy. Currently the disk is selected based on a hash. A free-capacity-based strategy should also be supported.
- Allow an app to write data to another disk when its assigned disk reaches the high-watermark #306.
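
To make the last three points concrete, here is a minimal sketch in Java of what a pluggable strategy with a high-watermark fallback might look like. Every name in it (`DiskInfo`, `DiskSelectionStrategy`, `HashBasedStrategy`, `FreeCapacityStrategy`, `WatermarkAwareSelector`) is hypothetical and not an existing class in this project, and the hash key (`appId` plus `shuffleId`) is an assumption. Capacity is read with the standard `java.io.File#getTotalSpace` and `getUsableSpace` calls.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical snapshot of one local disk; not an existing class in this project. */
final class DiskInfo {
    final String path;
    final long totalBytes;
    final long freeBytes;

    DiskInfo(String path, long totalBytes, long freeBytes) {
        this.path = path;
        this.totalBytes = totalBytes;
        this.freeBytes = freeBytes;
    }

    /** Reads total and usable space for the partition that holds the given path. */
    static DiskInfo probe(String path) {
        File dir = new File(path);
        return new DiskInfo(path, dir.getTotalSpace(), dir.getUsableSpace());
    }

    /** Fraction of the disk in use; a disk reporting zero capacity counts as full. */
    double usedRatio() {
        return totalBytes == 0 ? 1.0 : 1.0 - (double) freeBytes / totalBytes;
    }
}

/** Hypothetical plug-in point for choosing the disk that receives shuffle data. */
interface DiskSelectionStrategy {
    DiskInfo select(String appId, int shuffleId, List<DiskInfo> candidates);
}

/** Stateless hash-based choice; the key used here is an assumption. */
final class HashBasedStrategy implements DiskSelectionStrategy {
    @Override
    public DiskInfo select(String appId, int shuffleId, List<DiskInfo> candidates) {
        int index = Math.floorMod((appId + "_" + shuffleId).hashCode(), candidates.size());
        return candidates.get(index);
    }
}

/** Picks the candidate with the most free bytes. */
final class FreeCapacityStrategy implements DiskSelectionStrategy {
    @Override
    public DiskInfo select(String appId, int shuffleId, List<DiskInfo> candidates) {
        DiskInfo best = candidates.get(0);
        for (DiskInfo disk : candidates) {
            if (disk.freeBytes > best.freeBytes) {
                best = disk;
            }
        }
        return best;
    }
}

/** Drops disks at or above the high-watermark, then delegates to the strategy. */
final class WatermarkAwareSelector {
    private final DiskSelectionStrategy strategy;
    private final double highWatermark; // e.g. 0.9 means 90% used

    WatermarkAwareSelector(DiskSelectionStrategy strategy, double highWatermark) {
        this.strategy = strategy;
        this.highWatermark = highWatermark;
    }

    DiskInfo select(String appId, int shuffleId, List<DiskInfo> disks) {
        List<DiskInfo> writable = new ArrayList<>();
        for (DiskInfo disk : disks) {
            if (disk.usedRatio() < highWatermark) {
                writable.add(disk);
            }
        }
        if (writable.isEmpty()) {
            throw new IllegalStateException("All local disks reached the high-watermark");
        }
        return strategy.select(appId, shuffleId, writable);
    }
}
```

One trade-off in this sketch: once a disk is filtered out, the hash-based strategy may map the same app to a different disk than before, so the server would need to remember where earlier data was written. That is extra in-memory metadata, which a pure hash approach avoids.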
How should we improve?
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
PTAL @jerqi @xianjingfeng @leixm @smallzhongfeng @kaijchen
- We chose the hash selection strategy because we want to reduce the amount of metadata we need to maintain in memory.
- Can we use Consistent Hashing?
> Introduce the pluggable disk selection strategy. Currently the disk will be selected based on the hash. Free-capacity based strategy should be supported.
Agreed. Currently the hash-based strategy may cause unbalanced disk I/O across disks, as apps' shuffle patterns may vary dramatically. A strategy based on capacity and disk stats would be very nice to have.
> Introduce the free space & total space metrics of every local disk
@zuston how do you plan to collect these metrics? By using df, or any other fancy ways?
Interesting feature