[Improvement] support sequential unique block id
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I have searched in the issues and found no similar issues.
What would you like to be improved?
The problem of block id overflow is described in #731 and #1398. The block id is used to verify whether the accurate block set has been received. I think we can obtain a sequence id from the shuffle server, which makes overflow practically impossible. On the map side, we generate a local block id and then report the shuffle result; the shuffle server maps the local block id to a globally sequential, unique block id. On the reduce side, we fetch the shuffle result and obtain the set of global sequential unique block ids. Since the block ids are sequential, we only need to pass the length of the block set.
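A minimal sketch of the server-side mapping described above. This is illustrative only: the class and method names (`BlockIdRegistry`, `register`, `expectedCount`) are hypothetical and not part of the Uniffle API; it just shows how local block ids reported by map tasks could be assigned dense sequential global ids per partition, so the reduce side can verify completeness from the set length alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: maps map-side local block ids to globally
// sequential unique ids per partition. Not real Uniffle code.
public class BlockIdRegistry {
    // per-partition mapping: local block id -> global sequential id
    private final Map<Integer, Map<Long, Long>> partitionMaps = new HashMap<>();
    // per-partition counter for the next global id to hand out
    private final Map<Integer, Long> nextGlobalId = new HashMap<>();

    // Called when a map task reports its shuffle result: assigns the
    // next sequential global id (0, 1, 2, ...) for this partition,
    // or returns the existing one if the local id was already reported.
    public synchronized long register(int partitionId, long localBlockId) {
        Map<Long, Long> mapping =
                partitionMaps.computeIfAbsent(partitionId, p -> new HashMap<>());
        return mapping.computeIfAbsent(localBlockId, id -> {
            long next = nextGlobalId.getOrDefault(partitionId, 0L);
            nextGlobalId.put(partitionId, next + 1);
            return next;
        });
    }

    // Reduce side: because global ids are sequential 0..n-1, the whole
    // block set is fully described by its length.
    public synchronized long expectedCount(int partitionId) {
        return nextGlobalId.getOrDefault(partitionId, 0L);
    }

    public static void main(String[] args) {
        BlockIdRegistry registry = new BlockIdRegistry();
        long a = registry.register(0, 1234L); // first block in partition 0
        long b = registry.register(0, 5678L); // second block
        System.out.println(a + " " + b + " " + registry.expectedCount(0));
    }
}
```

The key point is that overflow concerns move from a packed client-side id layout to a simple per-partition counter on the server, and reduce-side verification becomes a length check instead of a set comparison.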
How should we improve?
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
This proposal has major changes to the API and needs to be fully discussed before development.
For Spark, the blockId is used to filter out unused ids by the upstream mapId.
Could you describe this design in more detail? How about drafting a proposal?