HashAggregate
This issue extends #57 to make sure HashAggregate delivers the following:
- [x] Replace the `std::unordered_map` hash tables with `phmap` (ideally) for better performance.
- [x] Add a toggle in `AggregateStrategy` to flip between `HashAggregate` and `Aggregate`. We can also introduce this toggle in `ssb_workload.cc` and `ssb_workload_lip.cc` to compare the performance.
- [ ] Handle hash conflicts more elegantly. (Using `phmap` and a custom `hash_combine` function can eliminate the problem.)
- [ ] Design an abstraction over the hash table, and remove some of the redundant type-checking code for the arrow arrays.
- [x] Refactor `arrow_compute_wrapper` and add some helpers to optimize the array construction.
- [ ] Eliminate most data copies between `std::` and `arrow::`.
- [ ] No memory leaks. Free all memory after `Finish()`.
- [ ] Make the code readable: use `auto` less.
See #67 for preliminary benchmark results.
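
As a rough illustration of the `hash_combine` item above, here is a minimal sketch of one way to hash a composite group-by key. Everything in it is an assumption for illustration and not taken from the hustle code: `GroupKey`, `GroupKeyHash`, `AggregateMap` and `sum_by_key` are hypothetical names, and the mixing formula is the well-known Boost-style one. The map stores the full key and compares it with `operator==`, so two keys that happen to hash to the same value still land in separate groups.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: folds the hash of `value` into `seed` using the
// Boost-style mixing formula.
template <typename T>
inline void hash_combine(std::size_t& seed, const T& value) {
  seed ^= std::hash<T>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical composite group-by key over two columns.
struct GroupKey {
  std::int64_t year;
  std::string brand;

  bool operator==(const GroupKey& other) const {
    return year == other.year && brand == other.brand;
  }
};

// Hash functor that combines the hashes of every key column.
struct GroupKeyHash {
  std::size_t operator()(const GroupKey& key) const {
    std::size_t seed = 0;
    hash_combine(seed, key.year);
    hash_combine(seed, key.brand);
    return seed;
  }
};

// std::unordered_map keeps this sketch dependency-free; phmap::flat_hash_map
// is designed to accept the same key, value and hash template parameters.
using AggregateMap = std::unordered_map<GroupKey, std::int64_t, GroupKeyHash>;

// Sums `values` per group key. Rows beyond the shorter input are ignored.
inline AggregateMap sum_by_key(const std::vector<GroupKey>& keys,
                               const std::vector<std::int64_t>& values) {
  AggregateMap sums;
  for (std::size_t i = 0; i < keys.size() && i < values.size(); ++i) {
    sums[keys[i]] += values[i];
  }
  return sums;
}
```

Calling `sum_by_key` with parallel vectors of keys and values returns one entry per distinct `(year, brand)` pair. Because equality is checked on the full key, a collision in `GroupKeyHash` only costs an extra comparison and cannot merge two groups.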