alluxio
Skip duplicate persist requests
What changes are proposed in this pull request?
Skip a duplicate persist request if a persist request for the same file is already being processed.
Design:
- Use a Set to track the ids of all files that are currently being persisted
- Call Set.add at the entrance of the state machine of persist
- Call Set.remove when the final state of the state machine is reached
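
The design above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `PersistDeduplicator` and the methods `tryStart` and `finish` are hypothetical, and a concurrent set is assumed here because persist requests may arrive from more than one thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the dedup set; names are illustrative only. */
final class PersistDeduplicator {
  // Ids of files whose persist request is currently in the state machine.
  private final Set<Long> mInFlightFileIds = ConcurrentHashMap.newKeySet();

  /**
   * Called at the entrance of the persist state machine.
   * Returns true if the caller should proceed, false if a request
   * for the same file id is already being processed.
   */
  boolean tryStart(long fileId) {
    // Set.add returns false when the element is already present.
    return mInFlightFileIds.add(fileId);
  }

  /** Called when the state machine reaches its final state. */
  void finish(long fileId) {
    mInFlightFileIds.remove(fileId);
  }
}
```

In a sketch like this, the caller would skip the request when `tryStart` returns false, and would call `finish` on both the success and failure paths (for example in a `finally` block) so that a failed persist does not block later requests for the same file.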
Why are the changes needed?
- Simplify the state machine of Persist
- Reduce the data uploaded to the UFS and save bandwidth
Does this PR introduce any user facing changes?
None
#16511