
Adding partition to unused table causes transaction conflicts

Open hlinander opened this issue 2 months ago • 6 comments

What happens?

When trying to set partitioning on a newly created, empty table, the transaction fails with a conflict when there are concurrent writers inserting into other tables.

CREATE TABLE lake.test (...);
ALTER TABLE lake.test SET PARTITIONED BY (...);
TransactionContext Error: Failed to commit: Failed to commit DuckLake transaction.
Exceeded the maximum retry count of 10 set by the ducklake_max_retry_count setting.
. Consider increasing the value with: e.g., "SET ducklake_max_retry_count = 100;"
Failed to write new snapshot to DuckLake: Duplicate key "snapshot_id: 1122" violates primary key constraint.

See the provided test code for reproducing using Python threads.

To Reproduce

(edit: updated the example to both create and alter the table at the same time)

import duckdb
import pandas as pd
import threading
import time
import os
import shutil
import random
import string

NUM_WRITER_THREADS = 100
WRITE_DURATION_SECONDS = 5

stop_event = threading.Event()

def setup_environment():
    if os.path.exists("./ducklake_generic_test"):
        shutil.rmtree("./ducklake_generic_test")
    os.makedirs("./ducklake_generic_test")

    con = duckdb.connect()
    con.sql("INSTALL ducklake;")
    con.sql("LOAD ducklake;")
    con.sql(f"ATTACH 'ducklake:./ducklake_generic_test/metadata.ducklake' AS generic_lake (DATA_PATH './ducklake_generic_test');")
    con.sql(f"USE generic_lake;")


    print("Creating separate writer table")
    con.sql("""
        CREATE TABLE writer_table (
            ts BIGINT,
            payload VARCHAR
        );
    """)
    con.close()

def writer_task(lake_connection: duckdb.DuckDBPyConnection, thread_id: int):
    cursor = lake_connection.cursor()
    print(f"[Writer {thread_id}] Started for table 'generic_lake.writer_table")
    while not stop_event.is_set():
        try:
            df = pd.DataFrame({
                'ts': [int(time.time() * 1000)],
                'payload': [''.join(random.choices(string.ascii_letters, k=10))]
            })
            cursor.execute("INSERT INTO generic_lake.writer_table SELECT * FROM df")
            time.sleep(random.uniform(0.001, 0.005))
        except duckdb.TransactionException:
            # Conflicts between writers are expected; not interested in those here
            pass


def set_partition_task(lake_connection: duckdb.DuckDBPyConnection):
    cursor = lake_connection.cursor()
    print("[Partitioner] Attempting to execute 'CREATE TABLE ...; ALTER TABLE ... SET PARTITIONED BY'...")
    try:
        cursor.execute("""
            CREATE TABLE generic_lake.main_table (
                col1 INTEGER,
                col2 DECIMAL(10, 2),
                partition_key DATE
            );
            ALTER TABLE generic_lake.main_table SET PARTITIONED BY (partition_key);
        """)
        print("[Partitioner] SUCCESS: ALTER TABLE command completed.")
    except Exception as e:
        print(f">>> [Partitioner] FAILED to set partition: {e}")

if __name__ == "__main__":
    setup_environment()
    
    shared_connection = duckdb.connect()
    shared_connection.sql("INSTALL ducklake;")
    shared_connection.sql("LOAD ducklake;")
    shared_connection.sql(f"ATTACH 'ducklake:./ducklake_generic_test/metadata.ducklake' AS generic_lake (DATA_PATH './ducklake_generic_test');")
    shared_connection.sql(f"USE generic_lake;")

    threads = []

    for i in range(NUM_WRITER_THREADS):
        thread = threading.Thread(target=writer_task, args=(shared_connection, i))
        threads.append(thread)
        thread.start()

    # Give the writers a moment to start before attempting the DDL
    time.sleep(0.5)

    partitioner_thread = threading.Thread(target=set_partition_task, args=(shared_connection,))
    threads.append(partitioner_thread)
    partitioner_thread.start()

    time.sleep(WRITE_DURATION_SECONDS)
    stop_event.set()

    for thread in threads:
        thread.join()
    
    shared_connection.close()

OS:

aarch64 OS X

DuckDB Version:

1.4.1

DuckLake Version:

45788f0

DuckDB Client:

CLI & Python

Hardware:

No response

Full Name:

Hampus Linander

Affiliation:

Hampus Linander

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • [x] Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • [x] Yes, I have

hlinander avatar Oct 13 '25 15:10 hlinander

Thanks for opening this issue in the DuckLake issue tracker! To resolve this issue, our team needs a reproducible example. This includes:

  • A source code snippet which reproduces the issue.
  • The snippet should be self-contained, i.e., it should contain all imports and should use relative paths instead of hard coded paths (please avoid /Users/JohnDoe/...).
  • A lot of issues can be reproduced with plain SQL code executed in the DuckDB command line client. If you can provide such an example, it greatly simplifies the reproduction process and likely results in a faster fix.
  • If the script needs additional data, please share the data as a CSV, JSON, or Parquet file. Unfortunately, we cannot fix issues that can only be reproduced with a confidential data set. Support contracts allow sharing confidential data with the core DuckDB team under NDA.

For more detailed guidelines on how to create reproducible examples, please visit Stack Overflow's “Minimal, Reproducible Example” page.

duckdblabs-bot avatar Oct 20 '25 11:10 duckdblabs-bot

Let me know what you need in addition to the current provided example for reproduction.

hlinander avatar Oct 20 '25 11:10 hlinander

hey @hlinander I did manage to reproduce this. The thing is, this is a bit hit or miss, because you can always tune the retries (https://ducklake.select/docs/stable/duckdb/usage/configuration); sometimes the operation will succeed and sometimes it won't (the retry settings are sketched after the list below). Two things that this issue brings up, in my opinion:

  1. I think what you are doing in the repro is a bit of an anti-pattern. If you want to add a partition, you would do it right after the CREATE TABLE statement, not later on.
  2. However, it is clear that when there are a bunch of connections (threads, in your case) doing operations in the DuckLake, collisions on snapshot ids become a problem, to the point where even with 100 retries per operation some operations may fail. If this happens with Python threads, it will surely be worse in other languages. So we do need to take a look at this.
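
For reference, a minimal sketch of the retry tuning mentioned above. ducklake_max_retry_count is confirmed by the error message in the report; ducklake_retry_wait_ms and ducklake_retry_backoff are assumptions taken from the linked configuration page, so check the names against your DuckLake version:

import duckdb

con = duckdb.connect()
con.sql("INSTALL ducklake;")
con.sql("LOAD ducklake;")
con.sql("ATTACH 'ducklake:./ducklake_generic_test/metadata.ducklake' AS generic_lake (DATA_PATH './ducklake_generic_test');")

# Raise the commit retry limit (the setting named in the error message)
con.sql("SET ducklake_max_retry_count = 100;")
# Initial wait and backoff multiplier between retries (assumed names,
# per the configuration docs linked above)
con.sql("SET ducklake_retry_wait_ms = 100;")
con.sql("SET ducklake_retry_backoff = 2;")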

guillesd avatar Oct 20 '25 11:10 guillesd

Yep, makes sense! I realise that this is probably an anti-pattern, but I thought it might be interesting to raise the question, since it looks like something that should be resolvable automatically: the table being partitioned is completely disjoint from anything being mutated in other places.

"If you want to add a partition, you would do this right after the create table statement and not really later on." Note that this applies to the current example as well, i.e. we create a table and then immediately try to partition it, but this is blocked by the issue as reported.

hlinander avatar Oct 20 '25 12:10 hlinander

I meant right after you create the table (in your script this would be in the setup_environment function); however, it doesn't matter, because the issue is still valid!
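
A minimal sketch of that placement, reusing the repro's paths (the helper name is hypothetical); the point is that both the CREATE and the ALTER happen before any writer thread exists:

import duckdb

def setup_partitioned_table():
    # Run this before starting any writer threads
    con = duckdb.connect()
    con.sql("INSTALL ducklake;")
    con.sql("LOAD ducklake;")
    con.sql("ATTACH 'ducklake:./ducklake_generic_test/metadata.ducklake' AS generic_lake (DATA_PATH './ducklake_generic_test');")
    con.sql("USE generic_lake;")
    con.sql("""
        CREATE TABLE main_table (
            col1 INTEGER,
            col2 DECIMAL(10, 2),
            partition_key DATE
        );
    """)
    # Partition immediately after creation, while no concurrent
    # snapshots can conflict with the ALTER
    con.sql("ALTER TABLE main_table SET PARTITIONED BY (partition_key);")
    con.close()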

This "as the table being partitioned is completely disjoint from anything being mutated in other places" is not technically true in DuckLake since snapshot ids are global. Meaning if you do an operation in a table completely unrelated to another one, it will still change the snapshot_id, which is shared in DuckLake to specify a valid state for the DuckLake in a specific moment in time. But ideally this shouldn't prevent your use case!

guillesd avatar Oct 20 '25 12:10 guillesd

I see! For completeness I updated the example to do both table creation and partitioning at the same time.

Thanks for the additional info.

hlinander avatar Oct 20 '25 13:10 hlinander