databend
feat: experimental runtime bloom pruning
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Implements runtime pruning of probe-side data blocks by using the runtime filter (based on the min-max filter) together with the bloom filter index of the probe table.
- Replace the range filter expression with an `eq` filter expression if min equals max while constructing the min-max filters. The `eq` filter expression is compatible with both the range index and the bloom index.
- During runtime filtering (of probe-side data), if runtime min-max pruning fails to prune a block, the bloom filter is tried.
- Add a new profile metric, `RuntimeBloomFilterPrunedParts`, which records the number of blocks pruned by the bloom filter.
- Fixes #[Link the issue here]
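
For reviewers, the pruning flow described in this summary can be sketched roughly as below. This is a simplified illustration, not the code in this PR: `RuntimeFilter`, `BlockMeta`, `PruneStats`, `build_runtime_filter`, and `should_keep` are hypothetical names, the join key is assumed to be an `i64`, and an exact `HashSet` stands in for the bloom index (a real bloom filter can return false positives, so it would prune fewer blocks). The `bloom_pruned` counter plays the role of `RuntimeBloomFilterPrunedParts`.

```rust
use std::collections::HashSet;

/// Runtime filter built from the build side's min/max join-key values
/// (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum RuntimeFilter {
    Range { min: i64, max: i64 },
    Eq(i64),
}

/// When min equals max, emit an `eq` filter instead of a range filter,
/// so that a bloom index can also evaluate it.
fn build_runtime_filter(min: i64, max: i64) -> RuntimeFilter {
    if min == max {
        RuntimeFilter::Eq(min)
    } else {
        RuntimeFilter::Range { min, max }
    }
}

/// Hypothetical per-block metadata of the probe table.
struct BlockMeta {
    min: i64,
    max: i64,
    /// Stand-in for the bloom index: an exact set, so no false positives.
    bloom: HashSet<i64>,
}

/// Pruning counters; `bloom_pruned` mirrors `RuntimeBloomFilterPrunedParts`.
#[derive(Default)]
struct PruneStats {
    range_pruned: usize,
    bloom_pruned: usize,
}

/// Returns false if the block can be skipped. Min-max pruning is tried
/// first; the bloom check runs only if min-max could not prune the block
/// and the filter is an `eq` filter.
fn should_keep(filter: &RuntimeFilter, block: &BlockMeta, stats: &mut PruneStats) -> bool {
    let (lo, hi) = match filter {
        RuntimeFilter::Range { min, max } => (*min, *max),
        RuntimeFilter::Eq(v) => (*v, *v),
    };
    // Min-max pruning: the block's value range does not overlap the filter.
    if block.max < lo || block.min > hi {
        stats.range_pruned += 1;
        return false;
    }
    // Fallback: bloom pruning, applicable only to equality filters.
    if let RuntimeFilter::Eq(v) = filter {
        if !block.bloom.contains(v) {
            stats.bloom_pruned += 1;
            return false;
        }
    }
    true
}
```

In this sketch a block whose range is `[1, 9]` but which does not contain the value `5` survives min-max pruning for the filter `Eq(5)` and is then dropped by the bloom check, which is the case the new metric is meant to count.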
Tests
- [ ] Unit Test
- [x] Logic Test
- [ ] Benchmark Test
- [ ] No Test - Explain why
Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Breaking Change (fix or feature that could cause existing functionality not to work as expected)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Docker Image for PR
- tag: `pr-15382-6ac94f2`
note: this image tag is only available for internal use, please check the internal doc for more details.
ClickBench Report
- hits: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/hits.html
- internal: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/internal.html
- load: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/load.html
- tpch: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/tpch.html
Docker Image for PR
- tag: `pr-15382-3037a5f`
note: this image tag is only available for internal use, please check the internal doc for more details.
ClickBench Report
- hits: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/hits.html
- internal: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/internal.html
- load: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/load.html
- tpch: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/tpch.html
@xudong963 Thanks for helping me review this PR; I really appreciate it. Let me try to make further adjustments to avoid using the bloom filter in situations where false positives could make bloom pruning nearly ineffective.