
HashAggregate

Open GindaChen opened this issue 3 years ago • 1 comment

This issue extends #57 to make sure HashAggregate delivers the following:

  • [x] Replace the std::unordered_map hash tables with phmap (ideally) for better performance.
  • [x] Add a toggle in AggregateStrategy to flip between HashAggregate and Aggregate. We can also introduce this toggle in ssb_workload.cc and ssb_workload_lip.cc to compare the performance.
  • [ ] Handle hash collisions more elegantly. (Using phmap and a custom hash_combine function can eliminate the problem.)
  • [ ] Design an abstraction over the hash table, and remove some of the redundant type-checking code for the arrow arrays.
  • [x] Refactor arrow_compute_wrapper and add some helpers to optimize the array construction.
  • [ ] Eliminate most data copies between std:: and arrow::.
  • [ ] No memory leaks. Free all memory after Finish().
  • [ ] Make the code readable: use auto less often.
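
For the hash-collision item above, here is a minimal sketch of what a custom hash_combine for multi-column group keys could look like. It is not taken from the hustle codebase: `GroupKey`, `GroupKeyHash`, `SumTable`, and `sum_by_key` are hypothetical names, and the mixing step is the widely used Boost-style combine. The sketch uses std::unordered_map so it compiles with the standard library alone; phmap::flat_hash_map takes the same key, value, and hasher template parameters, so swapping it in should be a small change, though that is an assumption about how this project would wire it up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Mix the hash of `value` into `seed` (Boost-style combine).
template <typename T>
inline void hash_combine(std::size_t& seed, const T& value) {
  seed ^= std::hash<T>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical two-column group key; not a type from the hustle codebase.
using GroupKey = std::pair<std::int64_t, std::int64_t>;

// Hasher that folds both key columns into a single hash value.
struct GroupKeyHash {
  std::size_t operator()(const GroupKey& key) const {
    std::size_t seed = 0;
    hash_combine(seed, key.first);
    hash_combine(seed, key.second);
    return seed;
  }
};

// The table compares full keys on lookup, so two distinct keys whose
// hashes happen to collide still end up in separate groups.
using SumTable = std::unordered_map<GroupKey, std::int64_t, GroupKeyHash>;

// Sum `values` per group key; both vectors must have the same length.
SumTable sum_by_key(const std::vector<GroupKey>& keys,
                    const std::vector<std::int64_t>& values) {
  assert(keys.size() == values.size());
  SumTable sums;
  sums.reserve(keys.size());
  for (std::size_t i = 0; i < keys.size(); ++i) {
    sums[keys[i]] += values[i];
  }
  return sums;
}
```

Keying the table on the full tuple of group-by values, rather than on a precomputed hash, leaves collision handling to the table itself; the combine function only needs to spread keys well, not be collision-free.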

GindaChen avatar Nov 05 '20 16:11 GindaChen

See #67 for preliminary benchmark results.

GindaChen avatar Nov 20 '20 20:11 GindaChen