[NEMO-472] Implement Intermediate Combine
JIRA: NEMO-472: Fix and Implement Hierarchical Aggregation
Major changes: [NEMO-472: Implement Hierarchical Aggregation] adds an intermediate accumulation operator in front of the final combine operator. When needed, it accumulates data among physically nearby containers before the data is shuffled across the WAN. Aggregating among nearby containers is expected to reduce the amount of data that must be transferred across the WAN. To achieve this:
- Implemented intermediate combine transform
- Previously, the Combine.PerKey transform consisted of 2 steps:
- Step 1, Partial Combine (a.k.a. pre-aggregation): accumulates elements within each container, so no data transfer across the network is needed in this step.
- Step 2, Final Combine: shuffles all data (hashed by key) and then combines it.
- An additional, optional step that partially accumulates the pre-aggregated data (only among nearby containers) is implemented and inserted between step 1 (partial) and step 2 (final).
- This new type of transform is used only in the intermediate accumulator vertex, which is a special type of operator vertex.
- Added a new type of communication channel, Partial Shuffle, which represents data transfer from an upstream operator to the intermediate accumulator vertex. It resembles a shuffle, except that the data shuffle occurs only among physically nearby containers.
- Implemented a compile-time optimization pass that inserts the intermediate accumulator vertex, which performs hierarchical aggregation prior to the shuffle, only when it is expected to be effective.
- Implemented unit tests.
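For reviewers, here is a minimal sketch of the three-step flow described above. It is not the code in this PR and does not use the Nemo or Beam classes: the class name, the method names, and the notion of a "site" (a group of physically nearby containers) are all hypothetical, and a per-key sum stands in for an arbitrary combine function.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Nemo.
final class HierarchicalCombineSketch {

  /** Convenience helper for building (key, value) elements. */
  public static Map.Entry<String, Long> kv(String key, long value) {
    return new AbstractMap.SimpleImmutableEntry<>(key, value);
  }

  /** Step 1 (partial combine): accumulate raw elements inside one container. */
  public static Map<String, Long> partialCombine(List<Map.Entry<String, Long>> elements) {
    Map<String, Long> acc = new HashMap<>();
    for (Map.Entry<String, Long> e : elements) {
      acc.merge(e.getKey(), e.getValue(), Long::sum);
    }
    return acc;
  }

  /** Merges already-accumulated partial results key by key. */
  public static Map<String, Long> mergePartials(List<Map<String, Long>> partials) {
    Map<String, Long> out = new HashMap<>();
    for (Map<String, Long> partial : partials) {
      partial.forEach((k, v) -> out.merge(k, v, Long::sum));
    }
    return out;
  }

  /**
   * Runs all three steps. The outer list holds sites, each site holds
   * containers, and each container holds its raw elements.
   */
  public static Map<String, Long> hierarchicalCombine(
      List<List<List<Map.Entry<String, Long>>>> sites) {
    List<Map<String, Long>> perSite = new ArrayList<>();
    for (List<List<Map.Entry<String, Long>>> site : sites) {
      List<Map<String, Long>> perContainer = new ArrayList<>();
      for (List<Map.Entry<String, Long>> container : site) {
        perContainer.add(partialCombine(container));
      }
      // Intermediate combine: merges only among containers of the same site,
      // which is the role a partial shuffle plays in this sketch.
      perSite.add(mergePartials(perContainer));
    }
    // Final combine: merges across sites; only one map per site crosses the WAN.
    return mergePartials(perSite);
  }
}
```

In this sketch, each site sends a single pre-merged map to the final combine instead of one map per container, which is where the reduction in WAN traffic is expected to come from.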
Minor changes to note:
- None
Tests for the changes:
- Tested on my Mac and Ubuntu machines
Other comments:
- Data transfer on the partial shuffle communication channel is implemented in #319.
- [TODO] More conditions need to be implemented to decide whether applying the pass is effective. The current logic is too naive.
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells
No Coverage information
0.0% Duplication
@taegeonum Thanks for the review! I've addressed your comments.
@Kangji Any update?
@Kangji Any update?
Not yet... :( It has been delayed by the fall semester, even though I'm trying to get to it ASAP. I'll let you know.
@taegeonum Can you take a final look?