duckdb_iceberg snowflake - cannot parse the metadata with duckdb v1.3.2

We found that duckdb 1.2 is able to properly parse the metadata snapshots created with snowflake, whereas duckdb v1.3.2 fails on the following document: https://gist.github.com/whatsthecraic/5b31954f9b94169559e6a4fe92de1ddb

The error returned by duckdb is:

D SELECT * FROM iceberg_scan('s3://bucket/path/to/metadata/pointer.json');
Invalid Input Error:
Object2 required property 'operation' is missing

The field operation is indeed missing in some of the snapshots.
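
For anyone who wants to check their own table, here is a minimal sketch that lists the snapshots whose summary lacks `operation`. It assumes the metadata JSON has already been downloaded and parsed into a dict; the function name `snapshots_missing_operation` and the sample document are hypothetical and not taken from the gist.

```python
import json


def snapshots_missing_operation(metadata: dict) -> list:
    """Return the snapshot-id of every snapshot whose summary has no 'operation' key."""
    missing = []
    for snapshot in metadata.get("snapshots", []):
        # Treat an absent or null summary the same as an empty one.
        summary = snapshot.get("summary") or {}
        if "operation" not in summary:
            missing.append(snapshot.get("snapshot-id"))
    return missing


# Illustrative metadata only (made-up ids), shaped like an Iceberg metadata.json.
sample = json.loads("""
{
  "format-version": 2,
  "snapshots": [
    {"snapshot-id": 1, "summary": {"operation": "append"}},
    {"snapshot-id": 2, "summary": {"added-records": "2"}}
  ]
}
""")

print(snapshots_missing_operation(sample))
```

To run it against a real table, replace `sample` with the parsed contents of the metadata file returned by the root pointer.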

The snapshots were created on snowflake with the following statements:

CREATE ICEBERG TABLE test1 ( A int ) CATALOG = 'snowflake' EXTERNAL_VOLUME = 'my_volume';

// Insert some data
INSERT INTO test1 VALUES (10);
INSERT INTO test1 VALUES (10), (20);
INSERT INTO test1 VALUES (100), (200), (300);

// retrieve the metadata/root pointer
SELECT SYSTEM$GET_ICEBERG_TABLE_INFORMATION('test1');

Aug 08 '25 08:08 whatsthecraic

Hi @whatsthecraic,

Thank you for filing the issue! We will take a look and try to finish the iceberg read regardless of the operation type. We don't have a snowflake testing suite yet, which is why this has not been caught. Another reason is that operation is a required field according to the REST spec, so it is a field Snowflake should add when creating snapshots.

Aug 08 '25 12:08 Tmonster

This sounds like #374

To fix this, duckdb-avro needs to be able to read the top-level metadata so we can act on the "format-version" in the file, rather than on the version in the metadata.json.

Aug 13 '25 23:08 Tishj

Also just stumbled upon this issue while reading Iceberg metadata from Snowflake.

This seems similar to this pyiceberg issue https://github.com/apache/iceberg-python/issues/1106

Aug 29 '25 15:08 jonas-w

@Tmonster would you accept a PR where we do a similar change as in pyiceberg?

https://github.com/apache/iceberg-python/pull/1263

Aug 29 '25 16:08 nicornk

I stumbled on the issue and opened a support case with Snowflake

Sep 04 '25 16:09 florian-ernst-alan

This is the response I received - let's see how long the fix takes. "The engineering team has determined that the issue is caused by a serialization issue around history snapshots, which they are working on fixing."

Sep 05 '25 06:09 nicornk

> @Tmonster would you accept a PR where we do a similar change as in pyiceberg? https://github.com/apache/iceberg-python/pull/1263

Hi @nicornk,

Yes, happy to accept a PR that will assume an operation value in the summary if it is missing 👍. DuckDB should be able to read as much of an iceberg table as it can whenever possible.
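
As a rough illustration of "assume an operation value if it is missing", here is a minimal Python sketch that works on an already-parsed metadata dict. It is not the duckdb-iceberg implementation. The helper name `fill_missing_operation` is hypothetical, and the choice of `"overwrite"` as the fallback is an assumption made for illustration.

```python
import copy

# Assumption: which value to fall back to is a design decision; "overwrite" is
# used here purely for illustration.
DEFAULT_OPERATION = "overwrite"


def fill_missing_operation(metadata: dict, default: str = DEFAULT_OPERATION) -> dict:
    """Return a copy of the metadata in which every snapshot summary has an 'operation'."""
    patched = copy.deepcopy(metadata)  # leave the caller's dict untouched
    for snapshot in patched.get("snapshots", []):
        # An absent or null summary is replaced by an empty dict before patching.
        summary = snapshot.get("summary") or {}
        summary.setdefault("operation", default)
        snapshot["summary"] = summary
    return patched


# Illustrative input only (made-up ids).
example = {
    "snapshots": [
        {"snapshot-id": 1, "summary": {"operation": "append"}},
        {"snapshot-id": 2, "summary": {"added-records": "2"}},
    ]
}
patched = fill_missing_operation(example)
print([s["summary"]["operation"] for s in patched["snapshots"]])
```

Existing `operation` values are left as they are; only snapshots without one receive the fallback, so a reader that requires the field can continue instead of rejecting the whole table.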

Oct 03 '25 11:10 Tmonster

Hi @Tmonster, we have prepared PR #524. Can you please take a look? Thanks.

Oct 09 '25 08:10 nicornk

Can be closed

Nov 29 '25 12:11 nicornk