incubator-uniffle
incubator-uniffle copied to clipboard
[Bug] Incorrect size in `LocalStorageMeta`
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
Describe the bug
I found some shuffle servers in our cluster write shuffle data to HDFS frequently. And then i found the size stored in LocalStorageMeta is incorrect.
Maybe we should update the metrics in the following method. There may be other places that have been missed.
https://github.com/apache/incubator-uniffle/blob/b2154c722f30ee31a867561f550254842c33cbab/server/src/main/java/org/apache/uniffle/server/storage/LocalStorageManager.java#L346
Other suggestion:
- Maybe it is not a good way that judging if the local disk can be written by using this metric, because this metric is easily overlooked and it is hard to guarantee it's accurate. And #1071 trying to change this.
- I found
shuffleMetaMapinLocalStorageMetais useless, maybe we should remove it. And then we can removeLocalStorageMetaat the same time.
Affects Version(s)
master
Uniffle Server Log Output
No response
Uniffle Engine Log Output
No response
Uniffle Server Configurations
No response
Uniffle Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
@jerqi @zuston @advancedxy @sfwang218 PTAL
Maybe we should update the metrics in the following method. There may be other places that have been missed.
How to update? The updated value maybe not the consistent with the real disk usage.
Maybe we should update the metrics in the following method. There may be other places that have been missed.
How to update? The updated value maybe not the consistent with the real disk usage.
You are right. So i prefer to remove LocalStorageMeta.
Maybe we should update the metrics in the following method. There may be other places that have been missed.
How to update? The updated value maybe not the consistent with the real disk usage.
You are right. So i prefer to remove
LocalStorageMeta.
+1 Please go ahead
@xianjingfeng @zuston Maybe the root cause is that we change the storage or storage manager in hadoopThreadPoolExecutor or
localFileThreadPoolExecutor which is designed to handle specified storage type.
The storage and storageManager who handle the ShuffleDataFlushEvent are messed up.
Besides, some metrics updates concurrently using the wrong key, for example incStorageFailedCounter is called while the storage under the event may have already changed.
To solve this issue and make the storage selection logic more clear, may be we could try the below way
- Mark and get
storageorstageManageronly throughShuffleDataFlushEvent - remove the concurrent logic to
ShuffleDataFlushEvent
I'm working on this
@xianjingfeng @zuston Maybe the root cause is that we change the storage or storage manager in
hadoopThreadPoolExecutororlocalFileThreadPoolExecutorwhich is designed to handle specified storage type. ThestorageandstorageManagerwho handle theShuffleDataFlushEventare messed up. Besides, some metrics updates concurrently using the wrong key, for exampleincStorageFailedCounteris called while the storage under the event may have already changed.
This may be just one of the cases.
I found shuffleMetaMap in LocalStorageMeta is useless, maybe we should remove it. And then we can remove LocalStorageMeta at the same time.
If LocalStageMeta is removed, how to support LocalStorage mode?
I found shuffleMetaMap in LocalStorageMeta is useless, maybe we should remove it. And then we can remove LocalStorageMeta at the same time.
If LocalStageMeta is removed, how to support LocalStorage mode?
Sorry, i can't get your point.
There are only two fields in LocalStageMeta, shuffleMetaMap and size.
shuffleMetaMap is useless, right?
And size is just for judging whether the local disk can be written, right?
So if we judge whether the local disk can be written by monitoring the usage of the local disk, size becomes useless too.
shuffleMetaMap is useless, right?
I don't think ShuffleMetaMap is useless. I believe its intention is to track every shuffleId's shuffle size and last read time. It should be updated in a thread safe manner. Without this metadata, it would be hard to track how many data each shuffle has been wrote, data management operations such as: quota limit per app/shuffle, back pressure or traffic throttle would be impossible.
And size is just for judging whether the local disk can be written, right?
Currently, there's a capacity configuration for shuffle server. It is possible that the shuffle server's disk is shared by other services, although not recommended. So we cannot simply check if the disk has free space to determine whether it's writable.
If we are going to remove LocalStorageMeta, I think we may have to add it back in the future to support advanced features.
shuffleMetaMap is useless, right?
I don't think ShuffleMetaMap is useless. I believe its intention is to track every shuffleId's shuffle size and last read time. It should be updated in a thread safe manner. Without this metadata, it would be hard to track how many data each shuffle has been wrote, data management operations such as: quota limit per app/shuffle, back pressure or traffic throttle would be impossible.
And size is just for judging whether the local disk can be written, right?
Currently, there's a capacity configuration for shuffle server. It is possible that the shuffle server's disk is shared by other services, although not recommended. So we cannot simply check if the disk has free space to determine whether it's writable.
If we are going to remove LocalStorageMeta, I think we may have to add it back in the future to support advanced features.
It's okay to keep it. Let's focus on how do we solve this problem and how can we avoid this problem from happening again in the future.
Can we close this? @xianjingfeng
Can we close this? @xianjingfeng
I don't think it can be closed.The problem still exists, we just don't use this variable anymore.