velox
[WIP] Use HashStringAllocator for raw_vector rows in HashLookup
The rows in HashLookup are stored as a raw_vector. The raw_vector uses malloc-style allocation to acquire aligned memory. This was found to contribute a large amount of unaccounted memory, mainly in HashProbe and in other operators as well.
This change allows raw_vector to use a custom allocator. By default, it keeps the current allocation and deallocation behavior.
For the raw_vectors that hold rows in HashLookup, the allocator used is the HashStringAllocator, which allows the memory to be accounted for.
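To illustrate the idea, here is a minimal, self-contained sketch of a vector with a pluggable allocator. It is not the code in this PR and not the Velox API: `RawAllocator`, `MallocAllocator`, `TrackingAllocator`, `RawVector`, and the 64-byte alignment are all hypothetical names and values chosen for illustration. `TrackingAllocator` only stands in for an allocator that accounts for memory, such as the HashStringAllocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical allocator interface (illustration only, not a Velox type).
struct RawAllocator {
  virtual ~RawAllocator() = default;
  virtual void* allocate(std::size_t bytes) = 0;
  virtual void free(void* ptr, std::size_t bytes) = 0;
};

// Default behavior: aligned, malloc-style allocation with no accounting.
struct MallocAllocator : RawAllocator {
  static constexpr std::size_t kAlignment = 64; // assumed alignment

  void* allocate(std::size_t bytes) override {
    // std::aligned_alloc needs a size that is a multiple of the alignment.
    std::size_t rounded = (bytes + kAlignment - 1) / kAlignment * kAlignment;
    void* ptr = std::aligned_alloc(kAlignment, rounded);
    if (ptr == nullptr) {
      throw std::bad_alloc();
    }
    return ptr;
  }

  void free(void* ptr, std::size_t /*bytes*/) override {
    std::free(ptr);
  }
};

// Stand-in for an accounting allocator: counts the bytes currently in use.
struct TrackingAllocator : MallocAllocator {
  std::size_t usedBytes = 0;

  void* allocate(std::size_t bytes) override {
    void* ptr = MallocAllocator::allocate(bytes);
    usedBytes += bytes;
    return ptr;
  }

  void free(void* ptr, std::size_t bytes) override {
    usedBytes -= bytes;
    MallocAllocator::free(ptr, bytes);
  }
};

// Shared default allocator, used when the caller does not pass one.
inline RawAllocator* defaultAllocator() {
  static MallocAllocator instance;
  return &instance;
}

// Simplified vector of trivially copyable values with a pluggable allocator.
template <typename T>
class RawVector {
  static_assert(std::is_trivially_copyable_v<T>, "T must be trivially copyable");

 public:
  explicit RawVector(RawAllocator* allocator = defaultAllocator())
      : allocator_(allocator) {}
  ~RawVector() { release(); }
  RawVector(const RawVector&) = delete;
  RawVector& operator=(const RawVector&) = delete;

  // Grows the buffer if needed; new elements are left uninitialized.
  void resize(std::size_t size) {
    if (size > capacity_) {
      T* newData = static_cast<T*>(allocator_->allocate(size * sizeof(T)));
      if (data_ != nullptr) {
        std::memcpy(newData, data_, size_ * sizeof(T));
      }
      release();
      data_ = newData;
      capacity_ = size;
    }
    size_ = size;
  }

  std::size_t size() const { return size_; }

  T& operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }

 private:
  // Returns the current buffer, if any, to the allocator.
  void release() {
    if (data_ != nullptr) {
      allocator_->free(data_, capacity_ * sizeof(T));
      data_ = nullptr;
      capacity_ = 0;
    }
  }

  RawAllocator* allocator_;
  T* data_ = nullptr;
  std::size_t size_ = 0;
  std::size_t capacity_ = 0;
};
```

In this sketch, a `RawVector<char*>` constructed with a `TrackingAllocator` reports its buffer in `usedBytes`, while a default-constructed `RawVector` keeps the untracked malloc-style path.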
Deploy Preview for meta-velox canceled.
| Name | Link |
|---|---|
| Latest commit | 2ece662dfb30cc243738a364d4d102fcf5a3d7f8 |
| Latest deploy log | https://app.netlify.com/sites/meta-velox/deploys/671ac4cfe8b82a00082ed753 |
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions!
In the end, the solution was to pass in a memory pool to handle the allocations rather than use the HashStringAllocator. The problem with the latter was that spilled rows were lost.
The actual fix is here: https://github.com/facebookincubator/velox/pull/12582