Reattach a ducklake with MySQL catalog
Hello,
Maybe I missed some information somewhere.
I set up a DuckLake with a MySQL catalog, everything seemed fine, but when I wan't to reattach it I run into those errors:
- Without specifying DATA_PATH:
ATTACH 'ducklake:mysql:db=ducklake_catalog host=192.168.1.100 user=root port=3306 password=xxx' AS myDucklake ; Invalid Input Error:
Attempting to create a new ducklake instance but data_path is not set - set the DATA_PATH parameter to the desired location of the data files
- Specifiying DATA_PATH
ATTACH 'ducklake:mysql:db=ducklake_catalog host=192.168.1.100 user=root port=3306 password=xxx' AS myDucklake (DATA_PATH 'data_files/');
IO Error: Failed to initialize DuckLake:Failed to run query "CREATE TABLEducklake_catalog.ducklake_metadata(keyTEXT NOT NULL,value TEXT NOT NULL);": Table 'ducklake_metadata' already exists
Are you sure that the database is not attached already?
You could try DETACH or SHOW databases; to find out.
Are you sure that the database is not attached already?
You could try
DETACHorSHOW databases;to find out.
Yes pretty much, also tried to close and open a new client before attaching the existing ducklake.
Make sure you don't have another duckdb session open in another terminal. I noticed that attachments are shared across all open :memory instances, which can definitely lead to some unexpected results. For example I was testing mysql and postgres catalogs in two separate terminals but used the same AS my_ducklake name for both attachments and they kept stepping on each other.
Hello, I have the same problem with DuckLake. Test spec:
- DuckDB CLI on Win11,
- DB Engine: MySQL(MariaDB) - 10.3.39-MariaDB-0ubuntu0.20.04.2 Ubuntu 20.04 Database Collation: utf8mb4_general_ci and Engine Type: InnoDB.
- Data files local on Win11
So after seeing this issue: https://github.com/duckdb/ducklake/issues/112
It seems that adding the METADATA_SCHEMA option solves the problem:
ATTACH 'ducklake:mysql:db=ducklake_catalog host=192.168.1.100 user=root port=3306 password=xxx' AS ducklake_catalog (DATA_PATH 'data_files/' , METADATA_SCHEMA 'ducklake_catalog');
I'm just wondering if this is normal and the relevant way to do this ?
It seems that the schema_name is not well resolve when mysql or postgresql is used in the following query : https://github.com/duckdb/ducklake/blob/main/src/storage/ducklake_initializer.cpp#L76
Indeed if the option METADATA_SCHEMA is specified it works, If it's the expected behaviour, the documentation would be updated....
using METADATA_SCHEMA 'ducklake_catalog' doesn't fix the ATTACH a second time issue for me I just now get
Failed to create DuckLake connection: IO Error: Failed to initialize DuckLake:Failed to run query "CREATE SCHEMA `ducklake_catalog`": Can't create database 'ducklake_catalog'; database exists
instead the of table already exists error
Any other ideas for a workaround ?
iled to initialize DuckLake:Failed to run query "CREATE SCHEMA
ducklake_catalog": Can't create database 'ducklake_catalog'; database exists
I did some tries and it's working well on my side (DuckDB v1.3.2, MySQL 8.0.42)
- Start a fresh cli and Init DuckLake with MySQL:
ATTACH 'ducklake:mysql:db=ducklake host=localhost' AS my_ducklake (DATA_PATH 'ducklake_data/')
- Stop Current cli and start a new one and attach existing ducklake :
ATTACH 'ducklake:mysql:db=ducklake host=localhost' AS my_ducklake (METADATA_SCHEMA 'ducklake')
Using METADATA_SCHEMA works, but how to specify METADATA_SCHEMA when directing attaching ducklake using cli?
duckdb ducklake:...
I'm using https://crates.io/crates/duckdb with an in memory duckdb. My first attach to mysql works, the second doesn't
Failed to create DuckLake connection: IO Error: Failed to initialize DuckLake:Failed to run query "CREATE SCHEMA `ducklake_catalog`": Can't create database 'ducklake_catalog'; database exists
I even used detach and drop of the connection before attempting a re-attach.
Using
METADATA_SCHEMAworks, but how to specify METADATA_SCHEMA when directing attaching ducklake using cli?
duckdb ducklake:...
For CLI, you can use the cmd option:
duckdb -cmd "ATTACH 'ducklake:mysql:db=ducklake host=localhost' AS my_ducklake (METADATA_SCHEMA 'ducklake')"
Is there a way to specify this when working with SQLMesh ? I tried adding metadata_schema key in the ducklake catalog but it just throws error:
Error: Invalid 'duckdb' connection config:
Invalid field 'catalogs.ducklake.str':
Input should be a valid string
Invalid field 'catalogs.ducklake.DuckDBAttachOptions.metadata_schema':
Extra inputs are not permitted