HashAggregate
This issue extends #57 to make sure `HashAggregate` delivers the following:

- [x] Replace the `std::unordered_map` hash tables with `phmap` (ideally) for better performance.
- [x] Add a toggle in `AggregateStrategy` to switch between `HashAggregate` and `Aggregate`. We can also introduce this toggle in `ssb_workload.cc` and `ssb_workload_lip.cc` to compare the performance.
- [ ] Handle hash collisions more elegantly. (Using phmap and a custom hash_combine function can eliminate the problem.)
- [ ] Design an abstraction over the hash table, and remove some of the redundant type-checking code for the Arrow arrays.
- [x] Refactor `arrow_compute_wrapper` and add some helpers to optimize array construction.
- [ ] Eliminate most data copies between `std::` and `arrow::`.
- [ ] No memory leaks. Free all memory after `Finish()`.
- [ ] Make the code readable: use `auto` less.

See #67 for preliminary benchmark results.
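
For the hash-collision item, here is a minimal sketch of one possible approach, not the code in this repository. It assumes the group-by key can be stored in full as a tuple of integers, so the hash table's own equality check resolves collisions instead of hand-written handling. `GroupKey`, `GroupKeyHash`, `SumTable`, and `sum_by_group` are hypothetical names, and `std::unordered_map` stands in for the table type. A phmap table should accept the same hasher as a template argument, but that is an assumption to verify against the phmap headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical composite group-by key: one value per group-by column.
using GroupKey = std::vector<std::int64_t>;

// Boost-style hash_combine: folds the hash of `value` into `seed`.
inline void hash_combine(std::size_t& seed, std::int64_t value) {
  seed ^= std::hash<std::int64_t>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hasher for the composite key. Order matters, so {1, 2} and {2, 1}
// normally hash differently; if two keys do collide, the map still
// tells them apart by comparing the stored keys with operator==.
struct GroupKeyHash {
  std::size_t operator()(const GroupKey& key) const {
    std::size_t seed = key.size();
    for (std::int64_t v : key) {
      hash_combine(seed, v);
    }
    return seed;
  }
};

// Running SUM per group (stand-in for the aggregate state).
using SumTable = std::unordered_map<GroupKey, std::int64_t, GroupKeyHash>;

SumTable sum_by_group(const std::vector<GroupKey>& keys,
                      const std::vector<std::int64_t>& values) {
  assert(keys.size() == values.size());
  SumTable table;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    table[keys[i]] += values[i];  // value-initialized to 0 on first insert
  }
  return table;
}
```

Storing the full key costs extra memory per group compared with keying on the combined hash alone, but it removes the need for any separate collision check.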