duckdb_iceberg

[Predicate Pushdown] Filtering deletes based on row counts

Open Tishj opened this issue 5 months ago • 0 comments

Spec

If metrics show that a delete file has no rows that match a scan predicate, it may be ignored just as a data file would be ignored [2].

For example, if file_a has rows with id between 1 and 10 and a delete file contains rows with id between 1 and 4, a scan for id = 9 may ignore the delete file because none of the deletes can match a row that will be selected.
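
A minimal sketch of the check described in the spec example, assuming the metrics expose a lower and upper bound for the id column of each file. The ColumnBounds type and the may_contain_equal helper are hypothetical names for illustration, not part of duckdb_iceberg or any Iceberg library.

```python
from dataclasses import dataclass


@dataclass
class ColumnBounds:
    """Lower/upper bounds for one column, as recorded in file metrics (hypothetical type)."""
    lower: int
    upper: int


def may_contain_equal(bounds: ColumnBounds, value: int) -> bool:
    """Return True if a row with column == value could fall within the bounds."""
    return bounds.lower <= value <= bounds.upper


# Bounds taken from the example in the spec text above.
file_a_id = ColumnBounds(lower=1, upper=10)
delete_file_id = ColumnBounds(lower=1, upper=4)

scan_value = 9  # scan predicate: id = 9

# The data file may hold matching rows, so it has to be scanned.
scan_data_file = may_contain_equal(file_a_id, scan_value)

# None of the deleted rows can match the predicate, so the delete file can be skipped.
apply_delete_file = may_contain_equal(delete_file_id, scan_value)
```

The same bounds test is applied to both files; only the outcome differs, which is what "ignored just as a data file would be ignored" suggests.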

Thoughts

I think this is why a positional delete file can contain an optional row column:

2147483544 row | required struct<...> [1] | Deleted row values. Omit the column when not storing deleted rows.

When present in the delete file, row is required because all delete entries must include the row values.

Here, required means that the value can't be NULL.
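
One possible reading of why this matters, sketched under my own assumptions: bounds derived from the row values are only safe to prune on if every delete entry actually carries a value. The id_bounds helper below is hypothetical and only illustrates that reasoning; it is not how any writer is known to compute metrics.

```python
from typing import List, Optional, Tuple


def id_bounds(deleted_ids: List[Optional[int]]) -> Optional[Tuple[int, int]]:
    """Return (min, max) of the deleted ids, or None if the bounds would be incomplete.

    Assumption: a NULL entry means the deleted row's id is unknown, so any
    bounds computed from the remaining values could wrongly exclude it.
    """
    if not deleted_ids or any(v is None for v in deleted_ids):
        return None
    return min(deleted_ids), max(deleted_ids)


complete = id_bounds([1, 2, 4])      # every entry has a value
incomplete = id_bounds([1, None, 4])  # one entry is missing its row value
```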

Tishj, Jun 27 '25 08:06