iceberg-python

Support Partial deletes

Open Fokko opened this issue 1 year ago • 0 comments

Feature Request / Improvement

Today we only support full deletes (overwrite). Partial deletes can be supported at several levels in Iceberg:

  • Pure delete operations:
    • Deleting a manifest from the manifest list
    • Deleting a manifest entry from a manifest
  • Rewrite operation:
    • Rewriting existing Parquet files that contain rows that match the predicate
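To make the three levels above concrete, here is a minimal sketch of how a delete planner could pick between them for a single predicate, `col < threshold`. Everything in it is hypothetical (`Action`, `FileStats`, `plan_delete_less_than`, `can_drop_manifest`); it is not the PyIceberg API, and it assumes each file carries trustworthy lower/upper bounds for the column and contains no nulls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Action(Enum):
    KEEP = "keep"        # no row can match: leave the file untouched
    DROP = "drop"        # every row matches: metadata-only delete
    REWRITE = "rewrite"  # some rows may match: the file must be rewritten


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file column bounds, as a manifest entry might record them."""
    path: str
    lower: int
    upper: int


def plan_delete_less_than(stats: FileStats, threshold: int) -> Action:
    """Classify one data file for `DELETE WHERE col < threshold`."""
    if stats.upper < threshold:
        # Even the largest value matches, so all rows match.
        return Action.DROP
    if stats.lower >= threshold:
        # Even the smallest value does not match, so no row matches.
        return Action.KEEP
    return Action.REWRITE


def can_drop_manifest(files: Iterable[FileStats], threshold: int) -> bool:
    """A manifest can leave the manifest list only if all of its entries are dropped."""
    return all(plan_delete_less_than(f, threshold) is Action.DROP for f in files)
```

Files classified as `DROP` only need a manifest (or manifest list) change; only the `REWRITE` files need their Parquet data touched.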

Pure metadata deletes can be achieved using https://github.com/apache/iceberg-python/pull/518 and https://github.com/apache/iceberg-python/pull/539. For the rewrite operations, we need an API that keeps us pluggable for other query engines. It would take a source path (including delete-on-read files), a destination path, and a residual filter that needs to be applied to the file.
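As a rough sketch of what such a pluggable rewrite API could look like: the names below (`RewriteTask`, `FileRewriter`, `InMemoryRewriter`) are made up for illustration and do not exist in PyIceberg. The sketch assumes the residual filter returns `True` for rows that must be deleted, and that delete files list row positions in the source file. Files are modelled as in-memory lists of dicts so the example runs without Parquet.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Protocol, Sequence

Row = Dict[str, Any]


@dataclass(frozen=True)
class RewriteTask:
    """Hypothetical description of one file rewrite."""
    source_path: str
    delete_file_paths: Sequence[str]        # delete files that apply to the source
    destination_path: str
    residual_filter: Callable[[Row], bool]  # assumption: True means "delete this row"


class FileRewriter(Protocol):
    """What an engine would implement to plug in its own rewrite."""

    def rewrite(self, task: RewriteTask) -> int:
        """Write the surviving rows to the destination and return their count."""
        ...


class InMemoryRewriter:
    """Toy implementation: `files` maps a path to its rows."""

    def __init__(self, files: Dict[str, List[Row]]) -> None:
        self.files = files

    def rewrite(self, task: RewriteTask) -> int:
        # Positions already marked as deleted by the delete files.
        deleted = {
            row["pos"]
            for path in task.delete_file_paths
            for row in self.files[path]
        }
        kept = [
            row
            for pos, row in enumerate(self.files[task.source_path])
            if pos not in deleted and not task.residual_filter(row)
        ]
        self.files[task.destination_path] = kept
        return len(kept)
```

For example, with four rows, one positional delete, and a residual of `id >= 4`:

```python
files = {
    "data/a.parquet": [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}],
    "data/a-deletes.parquet": [{"pos": 0}],
}
task = RewriteTask(
    source_path="data/a.parquet",
    delete_file_paths=["data/a-deletes.parquet"],
    destination_path="data/a-rewritten.parquet",
    residual_filter=lambda row: row["id"] >= 4,
)
InMemoryRewriter(files).rewrite(task)  # keeps the rows with id 2 and 3
```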

Fokko, Mar 27 '24 14:03