delta-rs
Support bloom filter table indexes
Description
Use Case
This could help speed up table scans. The feature is not documented in the official spec yet; see https://docs.databricks.com/delta/optimizations/bloom-filters.html for more details.
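To illustrate why a bloom filter index could speed up scans, here is a rough sketch of the general idea, not the Databricks format (which is unspecified) and not anything that exists in delta-rs today. `BloomFilter`, `FileIndex`, and `files_to_scan` are hypothetical names, and the hashing scheme (std's `DefaultHasher` seeded with a counter) is just an assumption to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy bloom filter for illustration only; not an on-disk format.
struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        BloomFilter { bits: vec![false; num_bits], num_hashes }
    }

    /// Derive the i-th bit position by hashing (i, value) together.
    fn position<T: Hash>(&self, value: &T, i: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        i.hash(&mut hasher);
        value.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    fn insert<T: Hash>(&mut self, value: &T) {
        for i in 0..self.num_hashes {
            let pos = self.position(value, i);
            self.bits[pos] = true;
        }
    }

    /// `false` means definitely absent; `true` means possibly present.
    fn might_contain<T: Hash>(&self, value: &T) -> bool {
        (0..self.num_hashes).all(|i| self.bits[self.position(value, i)])
    }
}

/// Hypothetical index entry: one data file plus a filter over one column.
struct FileIndex {
    path: String,
    filter: BloomFilter,
}

/// Keep only the files whose filter says the key may be present;
/// every other file can be skipped without being opened.
fn files_to_scan<'a>(index: &'a [FileIndex], key: &str) -> Vec<&'a str> {
    index
        .iter()
        .filter(|f| f.filter.might_contain(&key))
        .map(|f| f.path.as_str())
        .collect()
}
```

For a point lookup such as `WHERE id = 'alice'`, a reader would call something like `files_to_scan(&index, "alice")` and open only the returned files. False positives are possible, so a file may still be read unnecessarily, but a file containing the key is never skipped.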
Any update on this? I would really love to be able to filter on tables.
Databricks still hasn't open-sourced this feature I believe.
So this is only doable after Databricks open-sources it or at least releases the official spec, right?
welp, we can always reverse engineer the format if anyone is interested in doing that :D
While trying to fix parquet2 builds in a PR, I realized that parquet2 has at least some support for bloom filters: https://github.com/jorgecarleitao/parquet2/tree/main/src/bloom_filter
Just leaving this here for reference 😄.