
Error when migrating from ducklake 0.1 to 0.2

Open steven-luabase opened this issue 6 months ago • 3 comments

I recently upgraded ducklake from 0.1 to 0.2, as suggested in this issue to get partitions working: https://github.com/duckdb/ducklake/issues/231

However I'm hitting this error when the process tries to alter some of the catalog tables (I am using Postgres as the catalog) to add the columns needed for v0.2:

Failed to migrate DuckLake from v0.1 to v0.2:
Failed to execute query "ALTER TABLE "public"."ducklake_schema" ADD COLUMN  "path" VARCHAR": 
ERROR:  deadlock detected

Process 2072055 waits for AccessExclusiveLock on relation 49320 of database 49302; blocked by process 2072054.

Process 2072054 waits for AccessExclusiveLock on relation 49303 of database 49302; blocked by process 2072055.

There seems to be a circular lock dependency between the two processes.

steven-luabase avatar Jun 20 '25 19:06 steven-luabase

Saw that this looks similar to https://github.com/duckdb/ducklake/issues/147

The error was slightly different though: that one was about columns already existing, while the one I ran into is a deadlock.

I wonder if it's a postgres catalog specific issue?

steven-luabase avatar Jun 20 '25 19:06 steven-luabase

I had a similar issue while upgrading ducklake. In my case I got a migration error like the one above, except the cause was that I was running operations from multiple threads. The first thread performed the migration correctly, and subsequent threads failed because they tried to run the migration as well, even though it had already been done.

In my case it was only a problem on the first run; however, I think it points to a bigger issue: it seems quite dangerous to perform ducklake schema migrations transparently, given that this necessarily introduces side effects (the schema is altered), while ducklake/duckdb can't know what the global execution environment is.

Perhaps a safer approach would be to treat interacting with a ducklake database whose schema version doesn't match the client's as an error. If a client release requires a schema change, that client would simply inform the user that their ducklake schema needs an upgrade. For example:

D SELECT * FROM dl.some_table;
Error: Schema for 'dl' is incompatible with client version '0.2'. Use a ducklake client <= '0.1' or run 'CALL migration_command()' to perform the required migrations.

D CALL migration_command();

D SELECT * FROM dl.some_table;
...

There is probably a much better way to implement this, of course. Automatically performing migrations is convenient, but I think it's not great practice to silently mutate the user's database schema while doing a seemingly unrelated operation. Even more so if said mutation can cause the unrelated operations being executed by who knows who, who knows where, to fail in potentially transient/intermittent ways.
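One minimal shape for such a guard (a sketch, not the ducklake API; the names `SUPPORTED`, `check_schema_version`, and `SchemaVersionError` are all hypothetical) is a pure version check that refuses to operate rather than migrating silently:

```python
CLIENT_VERSION = "0.2"
SUPPORTED = {"0.2"}  # schema versions this client can use without migration

class SchemaVersionError(Exception):
    """Raised when the catalog schema version doesn't match the client."""

def check_schema_version(catalog_version: str) -> None:
    """Refuse to touch a catalog at an unsupported schema version,
    instead of mutating it as a side effect of an unrelated query."""
    if catalog_version not in SUPPORTED:
        raise SchemaVersionError(
            f"Schema version '{catalog_version}' is incompatible with "
            f"client version '{CLIENT_VERSION}'. Use a matching client "
            f"or run an explicit migration command."
        )

check_schema_version("0.2")  # matching version: no error
```

The key property is that the check is read-only: concurrent clients can all run it safely, and only a deliberate, user-invoked migration command ever writes to the catalog schema.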

kyzyl avatar Jun 22 '25 23:06 kyzyl

My temporary workaround is to make the migration schema changes manually, directly in the catalog (I'm using Postgres), by running a slight variation of the migration SQL below:

ALTER TABLE public.ducklake_schema
  ADD COLUMN path TEXT DEFAULT '',
  ADD COLUMN path_is_relative BOOLEAN DEFAULT TRUE;

ALTER TABLE public.ducklake_table
  ADD COLUMN path TEXT DEFAULT '',
  ADD COLUMN path_is_relative BOOLEAN DEFAULT TRUE;

ALTER TABLE public.ducklake_metadata
  ADD COLUMN scope TEXT,
  ADD COLUMN scope_id BIGINT;

ALTER TABLE public.ducklake_data_file
  ADD COLUMN mapping_id BIGINT;

CREATE TABLE public.ducklake_column_mapping (
  mapping_id    BIGINT,
  table_id      BIGINT,
  "type"        TEXT
);

CREATE TABLE public.ducklake_name_mapping (
  mapping_id       BIGINT,
  column_id        BIGINT,
  source_name      TEXT,
  target_field_id  BIGINT,
  parent_column    BIGINT
);

UPDATE public.ducklake_partition_column AS pc
SET column_id = sub.a[pc.column_id + 1]
FROM (
  SELECT
    table_id,
    array_agg(column_id ORDER BY column_order) AS a
  FROM public.ducklake_column
  WHERE parent_column IS NULL
    AND end_snapshot  IS NULL
  GROUP BY table_id
) AS sub
WHERE sub.table_id = pc.table_id;

UPDATE public.ducklake_metadata
SET value = '0.2'
WHERE "key" = 'version';

It seems to have worked, as I can now run pipelines that use ducklake 0.2.
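For anyone hitting the multi-client race described earlier in this thread, it may help to make the manual migration idempotent by gating it on the stored version key, so a second run is a no-op. A sketch of that pattern, using SQLite in place of the Postgres catalog and only a subset of the columns from the SQL above (`migrate_to_v02` is a hypothetical helper, not part of ducklake):

```python
import sqlite3

def migrate_to_v02(conn: sqlite3.Connection) -> bool:
    """Apply the v0.1 -> v0.2 migration exactly once; return True if applied."""
    (version,) = conn.execute(
        "SELECT value FROM ducklake_metadata WHERE key = 'version'"
    ).fetchone()
    if version != "0.1":
        return False  # already migrated (or unknown version): do nothing
    conn.execute("ALTER TABLE ducklake_schema ADD COLUMN path TEXT DEFAULT ''")
    conn.execute(
        "ALTER TABLE ducklake_schema "
        "ADD COLUMN path_is_relative BOOLEAN DEFAULT TRUE"
    )
    conn.execute("UPDATE ducklake_metadata SET value = '0.2' WHERE key = 'version'")
    conn.commit()
    return True

# Minimal stand-in catalog at v0.1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ducklake_metadata (key TEXT, value TEXT)")
conn.execute("CREATE TABLE ducklake_schema (schema_id INTEGER)")
conn.execute("INSERT INTO ducklake_metadata VALUES ('version', '0.1')")

first = migrate_to_v02(conn)   # applies the migration
second = migrate_to_v02(conn)  # sees version '0.2' and skips
print(first, second)
```

On real Postgres you'd additionally want the whole thing inside one transaction (and possibly behind `pg_advisory_xact_lock`) so concurrent migrators serialize instead of deadlocking.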

steven-luabase avatar Jun 23 '25 17:06 steven-luabase