duckdb_iceberg
Iceberg REST Catalog Support
Hey, team!
Very excited about the duckdb v0.9 support for iceberg!
I currently use a rest catalog for my iceberg tables and was hoping to be able to wire up duckdb to that rather than point it to the actual underlying data/metadata files.
If this is available, I'd love to use it -- otherwise, I'd be happy to jump in and start coding if this feature is new.
Thanks!
Hi @randypitcherii
Thanks for your interest! The iceberg extension is currently in a quite early stage. The REST catalog is not yet supported, so we are definitely interested in your help there! Feel free to reach out to me through the DuckDB discord for a chat!
Ok, no worries.
I'm thinking I'll chat with the REST catalog through Python, then get the details to my 🦆 db programmatically.
I'll see you on the discord!!! Thanks!
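For anyone taking the same route, here is a minimal sketch of that workaround, not an official integration. It assumes the catalog follows the Iceberg REST "load table" route (`/v1/{prefix}/namespaces/{namespace}/tables/{table}`), that the response carries a `metadata-location` field, and that the namespace is single-level. The catalog URL, table names, and sample response are made up for illustration, and the HTTP call itself (with whatever auth your catalog needs) is left out so the snippet runs offline.

```python
import json
from urllib.parse import quote

# Hypothetical response body; a real one would come from an authenticated
# GET against the catalog's load-table endpoint.
SAMPLE_RESPONSE = json.dumps({
    "metadata-location": "s3://bucket/warehouse/analytics/events/metadata/00001.metadata.json",
    "metadata": {},
})


def load_table_url(catalog_uri: str, namespace: str, table: str, prefix: str = "") -> str:
    """Build the load-table URL (single-level namespace assumed)."""
    parts = [catalog_uri.rstrip("/"), "v1"]
    if prefix:
        parts.append(quote(prefix, safe=""))
    parts += ["namespaces", quote(namespace, safe=""), "tables", quote(table, safe="")]
    return "/".join(parts)


def metadata_location(load_table_response: str) -> str:
    """Pull the current metadata file location out of the JSON response."""
    return json.loads(load_table_response)["metadata-location"]


def iceberg_scan_sql(location: str) -> str:
    """Build a DuckDB query over that metadata file, escaping single quotes."""
    escaped = location.replace("'", "''")
    return f"SELECT * FROM iceberg_scan('{escaped}')"


url = load_table_url("https://catalog.example.com/", "analytics", "events")
sql = iceberg_scan_sql(metadata_location(SAMPLE_RESPONSE))
print(url)
print(sql)
```

The resulting SQL string can then be passed to a DuckDB connection that has the iceberg extension loaded and credentials for the object store configured.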
Good morning @samansmink, is there any plan to support iceberg catalogs in general (not only REST) in the near future?
Thanks in advance.
Hey @thinkORo! I would love to, but I'm a bit low on time currently. In general I would say we would like to support the most used catalogs at some point, but I cannot give any timeline here at the moment. If you are interested in contributing, I'm happy to help out though.
Hi @samansmink,
Unfortunately, I'm only really good at Data Management and Data Analytics. And Python. Therefore, I can offer only very limited help in contributing to DuckDB.
But: If I can do something to increase the prioritization or support you elsewhere to give you more time for such a (really important, at least for me) implementation, I am happy to do so.
I have a framework in place for this if #51 gets merged, see the notes about the REST/Nessie catalog.
It should just be a few more lines of work to perform the HTTP request.
Up! Any updates on this?
Not yet -- been working on other things but will return to this soon.
Please update once it is implemented. Really excited to see DuckDB support for the REST catalog in Iceberg.
While the combination of DuckDB <> PyArrow <> PyIceberg support covers this use-case, the extension is much more efficient than loading the data into a PyArrow table. I would love to see support for Iceberg catalogs.
This integration is super exciting. Any updates on when we might expect it to be available? Looking forward to trying it out.
This should be resolved now, correct?
- https://github.com/duckdb/duckdb-iceberg/pull/98
@derekperkins, we'll see if it will be included in the next release. But #98 sounds very, hmm, suitable to me. But it could be that @randypitcherii has write access in mind as well.
Hey this is great news!
Write would be lovely, but I personally just wanted read access through an iceberg catalog.
ESPECIALLY valuable would be if duck worked with the catalog to push down predicates to only pull back data required for a given query.
Does this help?
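To make the pushdown idea concrete, here is a toy sketch of file pruning from per-file column bounds, the kind of min/max statistics Iceberg records for each data file. This is not how the extension is implemented; `DataFile`, `prune_files`, and the sample values are hypothetical and only illustrate the overlap check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataFile:
    path: str
    lower: int  # minimum value of the filtered column in this file
    upper: int  # maximum value of the filtered column in this file


def prune_files(files: List[DataFile], lo: int, hi: int) -> List[DataFile]:
    """Keep only files whose [lower, upper] range overlaps the query range [lo, hi]."""
    return [f for f in files if f.upper >= lo and f.lower <= hi]


files = [
    DataFile("a.parquet", 0, 10),
    DataFile("b.parquet", 11, 20),
    DataFile("c.parquet", 21, 30),
]

# A filter like `WHERE col BETWEEN 15 AND 25` only needs b.parquet and c.parquet.
selected = prune_files(files, 15, 25)
print([f.path for f in selected])
```

Files whose bounds fall entirely outside the predicate are never read, which is where most of the savings would come from.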
Redpanda would be interested in WRITE support to a DuckDB REST catalog as well