gradoop
[#1433] Add vertex retention to KeyedGrouping.
Adds the feature to retain ungrouped vertices to KeyedGrouping.
Description
Adds the feature as well as tests.
Related Issue
#1433
Motivation and Context
The older grouping implementation supported this. KeyedGrouping now supports it as well.
How Has This Been Tested?
Reuses some old tests and adds new tests.
Types of Changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
- [X] My code follows the code style of this project.
- [ ] I have updated the documentation accordingly (if necessary).
- [X] I have added tests to cover my changes.
- [X] All new and existing tests passed.
- [X] I ran a spell checker.
Change year in headers from 2020 to 2021.
I did a few small benchmarks on the cluster. I was comparing:
- a: no retention
- b: this implementation
- c: alternative implementation that does not need any additional join, but delays the translation of edges to tuples until after the id update
Results:
- a is faster than b and c
  - low parallelism (ldbc_10, 16 workers): b and c are ~200% slower
  - high parallelism (ldbc_10, 96 workers): b and c are ~10-40% slower
- b is faster than c
  - low parallelism: about the same
  - high parallelism: c is ~30% slower
Based on these results, the additional join is not a problem. In the next few days I will commit a few documentation and style improvements.
@timo95 for further commits, please add the issue number as prefix of your commit message ;)
I am finished anyway 😃. You can review now.