GH-32107: [C++] Create Filter Kernels for REE values
Rationale for this change
Run-end encoded arrays can very compactly represent filters with repeated values and run-end encoded arrays can be filtered efficiently without inflating all the runs for filtering. This PR implements algorithms that make it possible to leverage these possibilities.
What changes are included in this PR?
- [x] REE x REE filters kernels (for the same types supported in
run_end_encode) Closes #13657 - [x] Plain values x REE filter
- [x] REE values x Plain filter (boolean bitmap)
Are these changes tested?
By unit tests that handle all the REE/Plain combinations.
Are there any user-facing changes?
Users don't control which filter kernels get picked directly, so there isn't a user-facing API area yet.
- Closes: #32107
- Closes: #32107
:warning: GitHub issue #32107 has been automatically assigned in GitHub to PR creator.
:warning: GitHub issue #32107 has been automatically assigned in GitHub to PR creator.
:warning: GitHub issue #32107 has been automatically assigned in GitHub to PR creator.
@felipecrv do you have any anecdotal benchmark numbers?
@felipecrv do you have any anecdotal benchmark numbers?
@alippai For PlainxREE, only the intuition that for a sufficiently run-end-compressed filter it will be faster than scanning a full bitmap.
But I will be writing benchmarks for the 3 combinations when I complete the unit tests.