feat: experimental runtime bloom pruning

Open dantengsky opened this issue 9 months ago • 5 comments

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Implements runtime pruning of probe-side data blocks using the runtime filter (built from the min-max filter) and the bloom filter index of the probe table.

  • Replace the range filter expression with an eq filter expression when min equals max while constructing the min-max filters.

    The eq filter expression is compatible with both the range index and the bloom index.

  • During runtime filtering of probe-side data, if runtime min-max pruning fails to prune a block, the bloom filter is tried next.

  • Add a new profile metric, RuntimeBloomFilterPrunedParts, which records the number of blocks pruned by the bloom filter.

  • Fixes #[Link the issue here]
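
For readers who want a concrete picture of the flow described in the bullets above, here is a minimal, self-contained Rust sketch. It is not the code in this PR: every name in it (`RuntimeFilter`, `BlockMeta`, `build_runtime_filter`, `prune_block`, `prune_blocks`) is hypothetical, join keys are simplified to `i64`, and a `HashSet` stands in for the bloom filter index (a real bloom filter can return false positives, a `HashSet` cannot). It also assumes the bloom index is only consulted for eq filters, which is how the compatibility note above reads.

```rust
use std::collections::HashSet;

/// Hypothetical runtime filter derived from the build-side join keys.
#[derive(Debug, Clone, PartialEq)]
enum RuntimeFilter {
    Eq(i64),
    Range { min: i64, max: i64 },
}

/// Collapse the min-max range into an equality filter when min == max,
/// so that the filter can be checked against a bloom index as well.
fn build_runtime_filter(min: i64, max: i64) -> RuntimeFilter {
    if min == max {
        RuntimeFilter::Eq(min)
    } else {
        RuntimeFilter::Range { min, max }
    }
}

/// Hypothetical per-block metadata of the probe table.
struct BlockMeta {
    min: i64,
    max: i64,
    /// Stand-in for the bloom filter index (assumption: exact membership).
    bloom: HashSet<i64>,
}

/// Illustration helper: derive block metadata from the block's values.
/// Panics on an empty slice.
fn block_from_values(values: &[i64]) -> BlockMeta {
    BlockMeta {
        min: *values.iter().min().expect("non-empty block"),
        max: *values.iter().max().expect("non-empty block"),
        bloom: values.iter().copied().collect(),
    }
}

#[derive(Debug, PartialEq)]
enum PruneResult {
    Keep,
    PrunedByRange,
    PrunedByBloom,
}

/// Try min-max pruning first; if it cannot prune the block and the
/// filter is an equality, fall back to the bloom index.
fn prune_block(filter: &RuntimeFilter, block: &BlockMeta) -> PruneResult {
    let (lo, hi) = match filter {
        RuntimeFilter::Eq(v) => (*v, *v),
        RuntimeFilter::Range { min, max } => (*min, *max),
    };
    // Step 1: the filter range and the block range do not overlap.
    if hi < block.min || lo > block.max {
        return PruneResult::PrunedByRange;
    }
    // Step 2: bloom check, only meaningful for equality filters.
    if let RuntimeFilter::Eq(v) = filter {
        if !block.bloom.contains(v) {
            return PruneResult::PrunedByBloom;
        }
    }
    PruneResult::Keep
}

/// Returns the indices of the blocks that survive, plus the number of
/// blocks pruned by the bloom check (the quantity that a metric such as
/// RuntimeBloomFilterPrunedParts would record).
fn prune_blocks(filter: &RuntimeFilter, blocks: &[BlockMeta]) -> (Vec<usize>, usize) {
    let mut kept = Vec::new();
    let mut bloom_pruned = 0;
    for (i, b) in blocks.iter().enumerate() {
        match prune_block(filter, b) {
            PruneResult::Keep => kept.push(i),
            PruneResult::PrunedByBloom => bloom_pruned += 1,
            PruneResult::PrunedByRange => {}
        }
    }
    (kept, bloom_pruned)
}
```

In this sketch, a block holding the values 4 and 6 passes the min-max check for the filter `Eq(5)` because 5 lies inside [4, 6], but it is still pruned by the bloom step since 5 is not actually present. That is the case the second bullet targets.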

Tests

  • [ ] Unit Test
  • [x] Logic Test
  • [ ] Benchmark Test
  • [ ] No Test - Explain why

Type of change

  • [ ] Bug Fix (non-breaking change which fixes an issue)
  • [x] New Feature (non-breaking change which adds functionality)
  • [ ] Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • [ ] Documentation Update
  • [ ] Refactoring
  • [ ] Performance Improvement
  • [ ] Other (please describe):

This change is Reviewable

dantengsky avatar Apr 30 '24 06:04 dantengsky

Docker Image for PR

  • tag: pr-15382-6ac94f2

note: this image tag is only available for internal use, please check the internal doc for more details.

github-actions[bot] avatar May 01 '24 07:05 github-actions[bot]

ClickBench Report

  • hits: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/hits.html
  • internal: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/internal.html
  • load: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/load.html
  • tpch: https://benchmark.databend.rs/clickbench/pr/15382/8906600944/tpch.html

github-actions[bot] avatar May 01 '24 08:05 github-actions[bot]

Docker Image for PR

  • tag: pr-15382-3037a5f

note: this image tag is only available for internal use, please check the internal doc for more details.

github-actions[bot] avatar May 06 '24 12:05 github-actions[bot]

ClickBench Report

  • hits: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/hits.html
  • internal: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/internal.html
  • load: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/load.html
  • tpch: https://benchmark.databend.rs/clickbench/pr/15382/8968592119/tpch.html

github-actions[bot] avatar May 06 '24 13:05 github-actions[bot]

@xudong963 Thanks for helping me review this PR; I really appreciate it. Let me make some further adjustments to avoid using the bloom filter in situations where false positives could make bloom pruning nearly ineffective.

dantengsky avatar May 13 '24 01:05 dantengsky