duckdb_iceberg
[Predicate Pushdown] Filtering deletes based on row counts
Spec
If metrics show that a delete file has no rows that match a scan predicate, it may be ignored just as a data file would be ignored [2].
For example, if `file_a` has rows with `id` between 1 and 10 and a delete file contains rows with `id` between 1 and 4, a scan for `id = 9` may ignore the delete file because none of the deletes can match a row that will be selected.
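The spec example can be sketched as a simple bounds check. This is only an illustration, not how duckdb_iceberg implements it: `ColumnBounds`, `may_contain`, and `can_skip_delete_file` are hypothetical names, and the bounds are assumed to come from the lower/upper bound metrics recorded for the `id` column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnBounds:
    """Inclusive lower/upper bounds for one column (hypothetical helper type)."""
    lower: int
    upper: int


def may_contain(bounds: ColumnBounds, value: int) -> bool:
    # A file can only hold a matching row if the value lies within its bounds.
    return bounds.lower <= value <= bounds.upper


def can_skip_delete_file(delete_bounds: ColumnBounds, predicate_value: int) -> bool:
    # If no deleted row can match `id = predicate_value`, the delete file
    # cannot affect any selected row and may be ignored.
    return not may_contain(delete_bounds, predicate_value)


file_a = ColumnBounds(lower=1, upper=10)       # data file: id between 1 and 10
delete_file = ColumnBounds(lower=1, upper=4)   # delete file: id between 1 and 4

# Scan for id = 9: the data file must be read, the delete file can be skipped.
read_data_file = may_contain(file_a, 9)
skip_delete_file = can_skip_delete_file(delete_file, 9)
```

A scan for `id = 3` would not be able to skip the delete file, since 3 falls inside its bounds.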
Thoughts
I think this is why a positional delete file can contain an optional `row` column:
| Field id, name | Type | Description |
|---|---|---|
| 2147483544 `row` | required struct<...> [1] | Deleted row values. Omit the column when not storing deleted rows. |
When present in the delete file, `row` is `required` because all delete entries must include the row values.
Here, `required` means that the value can't be NULL.
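If that reading is right, the pruning decision for a positional delete file depends on whether `row` was stored. A rough sketch under that assumption: `can_skip_positional_delete` is a hypothetical helper, and `row_id_bounds` stands in for the bounds of the `id` field inside `row`, or `None` when the column was omitted.

```python
from typing import Optional, Tuple


def can_skip_positional_delete(
    row_id_bounds: Optional[Tuple[int, int]], wanted_id: int
) -> bool:
    # Without the `row` column there are no value bounds to compare against,
    # so the delete file has to be kept (the conservative choice).
    if row_id_bounds is None:
        return False
    lower, upper = row_id_bounds
    # With bounds available, skip only if no deleted row can have this id.
    return wanted_id < lower or wanted_id > upper
```

With bounds `(1, 4)` a scan for `id = 9` could skip the file, while the same scan against a delete file without `row` could not.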