Added support for METADATA_SCHEMA in ducklake config.yml
Summary
Add support for the METADATA_SCHEMA configuration option for DuckLake catalogs in DuckDB connections. This allows users to specify which schema in the catalog server should store DuckLake metadata tables.
Changes
- Core Configuration (sqlmesh/core/config/connection.py):
  - Added metadata_schema optional field to DuckDBAttachOptions
  - Updated to_sql() method to include METADATA_SCHEMA parameter in the DuckDB ATTACH statement when specified
- Documentation (docs/integrations/engines/duckdb.md):
  - Added metadata_schema to YAML and Python configuration examples for DuckLake catalogs
  - Added comprehensive configuration options table documenting all DuckLake-specific parameters including the new metadata_schema option
- Tests (tests/core/test_connection_config.py):
  - Added test_ducklake_metadata_schema() with three test cases:
    - Verifies METADATA_SCHEMA is included in SQL when specified
    - Confirms default behavior (no METADATA_SCHEMA) when not specified
    - Tests interaction with other DuckLake options (data_path, encrypted)
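To illustrate the to_sql() change described above, here is a minimal standalone sketch of how an optional metadata_schema field could be turned into a METADATA_SCHEMA clause in the ATTACH statement. The class name DuckLakeAttachSketch, the option ordering, and the exact statement shape are assumptions made for illustration; this is not the code in sqlmesh/core/config/connection.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DuckLakeAttachSketch:
    """Hypothetical stand-in for the DuckLake attach options (not the sqlmesh class)."""

    path: str
    data_path: Optional[str] = None
    encrypted: bool = False
    metadata_schema: Optional[str] = None

    def to_sql(self, alias: str) -> str:
        """Build an ATTACH statement, adding only the options that were set."""
        options = []
        if self.data_path is not None:
            options.append(f"DATA_PATH '{self.data_path}'")
        if self.encrypted:
            options.append("ENCRYPTED")
        if self.metadata_schema is not None:
            # Emitted only when the user sets it, so DuckLake's default applies otherwise.
            options.append(f"METADATA_SCHEMA '{self.metadata_schema}'")

        # Assumption: the catalog path carries a "ducklake:" prefix in the statement.
        path = self.path if self.path.startswith("ducklake:") else f"ducklake:{self.path}"
        options_sql = f" ({', '.join(options)})" if options else ""
        return f"ATTACH IF NOT EXISTS '{path}' AS {alias}{options_sql}"


if __name__ == "__main__":
    opts = DuckLakeAttachSketch(path="catalog.ducklake", metadata_schema="custom_workspace")
    print(opts.to_sql("ducklake_catalog"))
    # ATTACH IF NOT EXISTS 'ducklake:catalog.ducklake' AS ducklake_catalog (METADATA_SCHEMA 'custom_workspace')
```

Leaving the clause out entirely when metadata_schema is unset is what keeps the change backward compatible in this sketch: existing configurations would produce the same SQL as before.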
Motivation
DuckLake users need the ability to control where metadata tables are stored within the catalog server. The default main schema may not always be appropriate for organizational or security requirements. This configuration option provides that flexibility while maintaining backward compatibility: when metadata_schema is not specified, no METADATA_SCHEMA parameter is emitted and DuckLake's default behavior applies.
Example Usage
gateways:
  my_gateway:
    connection:
      type: duckdb
      catalogs:
        ducklake_catalog:
          type: ducklake
          path: catalog.ducklake
          metadata_schema: custom_workspace
Test Plan
- Unit tests verify SQL generation with and without metadata_schema
- Unit tests confirm compatibility with other DuckLake options
- Documentation updated with examples
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
rmac does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
METADATA_SCHEMA is documented in docs/integrations/engines/duckdb.md
https://github.com/TobikoData/sqlmesh/pull/5456/files#diff-de5de8bb701edca7af9847e114df48bfceefda5f991eaa196a2dfe765a5af9a3
Hey @justbry, sorry for the delay here. I was talking about official documentation around this flag; could you link it here? Also, can you please sign the CLA? Thanks!
DuckLake extension parameters for reference: https://ducklake.select/docs/stable/duckdb/usage/connecting#parameters
Is the only thing holding up this merge that @justbry still needs to sign the CLA? I need this update.
In addition to the last comment, it appears the user name and email address used to commit these changes differ from those of the GitHub user that pushed this PR.
Please either (1) recommit with an email address that is associated with the @justbry account or (2) add the email address for "rmac" to that account, as stated in the bot comment.
I signed in, but I had a different email configured in git on my new computer, so it did not accept my signature. Sorry for the extra work this caused. Thanks for adding the support.