
Orphaned files on failed inserts

Open renre-axw opened this issue 5 months ago • 6 comments

Whilst performing an INSERT INTO FROM statement, the query apparently completes, but when I then try to COMMIT I get the following error:

100% ▕████████████████████████████████████████████████████████████▏
D COMMIT;
TransactionContext Error: Failed to commit: Failed to execute query "ROLLBACK":

I am then left with orphaned files in the S3 store, and running the cleanup call CALL ducklake_cleanup_old_files('ducklake', cleanup_all => true); does not remove them.

renre-axw avatar Jul 17 '25 21:07 renre-axw

Thanks for the report!

Orphaned files are inevitable if insertions are cancelled before interacting with the catalog. We plan to add a method to remove orphaned files in the near future.

That said, in this case it seems like a problem that should be resolvable. It would help if you have a full reproducer so we could investigate further.
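To illustrate what "orphaned" means here, the following is a minimal sketch rather than anything DuckLake provides: it treats a file as orphaned when it exists in the data store but is not registered in the catalog. The function name `find_orphaned_files` and all paths are hypothetical, and how the two listings are obtained (for example, an S3 listing and a query against the catalog database) is deliberately left out.

```python
from typing import Iterable, Set


def find_orphaned_files(
    storage_paths: Iterable[str], catalog_paths: Iterable[str]
) -> Set[str]:
    """Return paths present in storage but not registered in the catalog."""
    registered = set(catalog_paths)
    return {path for path in storage_paths if path not in registered}


# Hypothetical listings, for illustration only.
storage = [
    "s3://example-bucket/lake/tbl/file-a.parquet",
    "s3://example-bucket/lake/tbl/file-b.parquet",
    "s3://example-bucket/lake/tbl/file-c.parquet",
]
catalog = [
    "s3://example-bucket/lake/tbl/file-a.parquet",
    "s3://example-bucket/lake/tbl/file-c.parquet",
]

orphans = find_orphaned_files(storage, catalog)
print(sorted(orphans))
```

Note that a comparison like this would also flag files written by a transaction that is still in flight and has not yet committed, so it should only be used to list candidates, not to delete anything directly.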

Mytherin avatar Jul 28 '25 09:07 Mytherin

Thanks for opening this issue in the DuckLake issue tracker! To resolve this issue, our team needs a reproducible example. This includes:

  • A source code snippet which reproduces the issue.
  • The snippet should be self-contained, i.e., it should contain all imports and should use relative paths instead of hard coded paths (please avoid /Users/JohnDoe/...).
  • A lot of issues can be reproduced with plain SQL code executed in the DuckDB command line client. If you can provide such an example, it greatly simplifies the reproduction process and likely results in a faster fix.
  • If the script needs additional data, please share the data as a CSV, JSON, or Parquet file. Unfortunately, we cannot fix issues that can only be reproduced with a confidential data set. Support contracts allow sharing confidential data with the core DuckDB team under NDA.

For more detailed guidelines on how to create reproducible examples, please visit Stack Overflow's “Minimal, Reproducible Example” page.

duckdblabs-bot avatar Aug 04 '25 07:08 duckdblabs-bot

Hi @renre-axw if you have time to provide a reproducer I think this is a nice one to pick up!

guillesd avatar Aug 20 '25 11:08 guillesd

Hi! I'm experiencing the same issue. In my case, I see it with extremely long inserts (a few hours). Progress slowly climbs to 100% and then I see this exact message: Failed to commit: Failed to execute query "ROLLBACK":

In the most recent case this happened while a table was being created, so I wouldn't expect any logical conflict with existing data. I'm not sure whether other operations running on other tables at the same time might cause issues.

ironman5366 avatar Aug 27 '25 22:08 ironman5366

I experienced this issue whilst also performing a long insert. I am unable to provide the data used, but I suspect I can create fake data of similar size and nature and hopefully reproduce it. The setup was inserting records to S3, using an Amazon RDS Aurora DB for the DuckLake.

renre-axw avatar Aug 28 '25 09:08 renre-axw

Hi @renre-axw, if you manage to create a reproducer with a very long failed insert, please share it!

guillesd avatar Sep 02 '25 08:09 guillesd