Define the API required for extending ducklake.
What functions have to be provided to extend the ducklake extension with a new catalog / metadata / data inlining backend?
Assume a new database extension is added. Can the ducklake extension use it as soon as the implemented database can be attached in duckdb, or are other endpoints required?
How is the configuration for the connection passed to the new extension?
I think it would be good to define the API calls that are required to extend the ducklake extension.
All that is required is that a database extension is added, e.g. an extension similar to the Postgres/SQLite extensions. That extension needs to have support for a number of different statements, e.g.:
- CREATE/DROP/ALTER TABLE
- INSERT/UPDATE/DELETE
The set of SQL statements fired by the extension is defined here. Note that most of the SQL is executed DuckDB-side by default, although it is possible to push down statements into the remote database as well (either database-extension side, or in DuckLake explicitly as we do for e.g. Postgres in a limited capacity here).
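As a rough illustration of the statement classes listed above, here is a minimal smoke test that fires one statement of each kind at a backend and records whether it was accepted. This is only a sketch: Python's built-in `sqlite3` module stands in for the candidate database, the table name `smoke_test` is hypothetical, and the statements are generic examples rather than the actual SQL that DuckLake issues.

```python
import sqlite3

# One example per statement class mentioned above.
# The table name "smoke_test" is hypothetical and exists only for this check.
STATEMENTS = [
    ("CREATE TABLE", "CREATE TABLE smoke_test (id INTEGER PRIMARY KEY, name TEXT)"),
    ("ALTER TABLE", "ALTER TABLE smoke_test ADD COLUMN note TEXT"),
    ("INSERT", "INSERT INTO smoke_test (id, name) VALUES (1, 'a'), (2, 'b')"),
    ("UPDATE", "UPDATE smoke_test SET note = 'seen' WHERE id = 1"),
    ("DELETE", "DELETE FROM smoke_test WHERE id = 2"),
    ("DROP TABLE", "DROP TABLE smoke_test"),
]


def check_backend(conn):
    """Run each statement in order and record whether the backend accepted it."""
    results = {}
    for label, sql in STATEMENTS:
        try:
            conn.execute(sql)
            results[label] = True
        except sqlite3.Error:
            results[label] = False
    return results


# Assumption: an in-memory SQLite database stands in for the new backend.
conn = sqlite3.connect(":memory:")
results = check_backend(conn)
conn.close()

for label, ok in results.items():
    print(f"{label}: {'ok' if ok else 'FAILED'}")
```

A backend that fails any of these basic statement classes is unlikely to work as a catalog database; passing them is necessary but not sufficient, since the full set of statements is the one defined in the spec.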
@feanor12 do you have any further doubts? More documentation on the spec may come in the future.
@guillesd I think I understand the required DQL, DML and DDL parts.
What I am missing is the detail behind "extension similar to the Postgres/SQLite extensions". Does that mean one has to implement a catalog? For example, if I look at nanodbc:
- There is a function to execute DQL:
  SELECT * FROM odbc_query( ... )
- A function to execute DML and DDL statements:
  CALL odbc_exec(...)
- And an option to attach database tables as views:
  CALL odbc_attach(...)
Would ducklake work after attaching the odbc tables as views? Calling odbc_query and so on is likely not a standard interface that ducklake uses, so even though the functionality is there, the interface is likely not designed for use in ducklake.
@feanor12
> Does that mean one has to implement a catalog?
It needs to implement the Storage Extension; see the example in duckdb-postgres.
Unfortunately, this functionality is not yet exposed in the C API; hopefully it can be added there soon.