[core/iceberg] Add optional snapshot summary fields to Iceberg metadata
Purpose
Linked issue: close #6502
This PR fixes Redshift Spectrum querying for Paimon tables with Iceberg compatibility by populating optional snapshot summary fields that are required by certain Iceberg query engines.
When Paimon generates Iceberg metadata, it currently only includes the operation field in snapshot summaries. While the Iceberg specification marks most summary fields as "optional," some query engines (notably AWS Redshift Spectrum) require fields like total-records to successfully parse and query tables.
As a result, Paimon+Iceberg tables can be queried in AWS Athena but fail in Redshift Spectrum with the error: Required field total-records missing.
Changes
Added computeSnapshotSummary() Helper Method
Aggregates statistics from IcebergManifestFileMeta objects to compute snapshot-level metrics including:
Fields that are always written (required by some engines, although optional in the Iceberg specification):
total-records - Total number of live records
total-data-files - Total number of live data files
total-delete-files - Total number of live delete files
total-position-deletes - Total position delete records
total-equality-deletes - Always "0" (Paimon doesn't use equality deletes)
Optional fields (when non-zero):
added-data-files, added-records, added-files-size
deleted-data-files, deleted-records, deleted-files-size
total-files-size
changed-partition-count
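The aggregation described above can be sketched roughly as follows. This is an illustration, not the code in this PR: ManifestStats is a hypothetical stand-in for the counters carried by IcebergManifestFileMeta, the method signature is assumed, only data manifests are considered (so the delete-file totals are hard-coded to "0"), and the file-size and partition-count fields are left out for brevity.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SnapshotSummarySketch {

    /** Hypothetical per-manifest counters; a stand-in, not Paimon's IcebergManifestFileMeta. */
    static final class ManifestStats {
        final long addedFiles, existingFiles, deletedFiles;
        final long addedRows, existingRows, deletedRows;

        ManifestStats(long addedFiles, long existingFiles, long deletedFiles,
                      long addedRows, long existingRows, long deletedRows) {
            this.addedFiles = addedFiles;
            this.existingFiles = existingFiles;
            this.deletedFiles = deletedFiles;
            this.addedRows = addedRows;
            this.existingRows = existingRows;
            this.deletedRows = deletedRows;
        }
    }

    /** Sums the per-manifest counters into an Iceberg-style snapshot summary map. */
    static Map<String, String> computeSnapshotSummary(String operation, List<ManifestStats> manifests) {
        long addedFiles = 0, existingFiles = 0, deletedFiles = 0;
        long addedRows = 0, existingRows = 0, deletedRows = 0;
        for (ManifestStats m : manifests) {
            addedFiles += m.addedFiles;
            existingFiles += m.existingFiles;
            deletedFiles += m.deletedFiles;
            addedRows += m.addedRows;
            existingRows += m.existingRows;
            deletedRows += m.deletedRows;
        }

        Map<String, String> summary = new LinkedHashMap<>();
        summary.put("operation", operation);

        // Always written: live entries are the added plus the existing ones.
        summary.put("total-records", Long.toString(addedRows + existingRows));
        summary.put("total-data-files", Long.toString(addedFiles + existingFiles));
        // Assumption of this sketch: only data manifests are passed in, so no deletes.
        summary.put("total-delete-files", "0");
        summary.put("total-position-deletes", "0");
        summary.put("total-equality-deletes", "0");

        // Written only when non-zero.
        putIfNonZero(summary, "added-data-files", addedFiles);
        putIfNonZero(summary, "added-records", addedRows);
        putIfNonZero(summary, "deleted-data-files", deletedFiles);
        putIfNonZero(summary, "deleted-records", deletedRows);
        return summary;
    }

    private static void putIfNonZero(Map<String, String> summary, String key, long value) {
        if (value != 0) {
            summary.put(key, Long.toString(value));
        }
    }

    public static void main(String[] args) {
        Map<String, String> summary = computeSnapshotSummary(
                "append", Arrays.asList(new ManifestStats(2, 0, 0, 83282, 0, 0)));
        summary.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Values are stored as strings because Iceberg snapshot summaries are string-to-string maps. Writing the total-* keys unconditionally, even when zero, is what lets strict readers find the fields they expect.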
Tests
Updated IcebergMetadataTest.java
API and Format
N/A
Documentation
Reintroduces a feature that was previously available.
aws s3 cp s3://some-bucket/paimon/warehouse/somedb.db/some_table/metadata/v190.metadata.json - | jq '.snapshots[0].summary'
{
"added-data-files": "2",
"total-equality-deletes": "0",
"added-records": "83282",
"deleted-data-files": "0",
"deleted-records": "0",
"total-records": "83282",
"deleted-files-size": "0",
"changed-partition-count": "1",
"total-position-deletes": "0",
"added-files-size": "4683766",
"total-delete-files": "0",
"total-files-size": "4683766",
"total-data-files": "2",
"operation": "append"
}
Redshift Spectrum can now query the table.
@JingsongLi Can you take a look at your convenience please?