delta-rs
LockClientError
Environment: PyPI deltalake 0.16.0
Delta-rs version: deltalake 0.16.0
Binding: Python
Environment:
- Cloud provider: AWS
- OS: Linux
- Other:
Bug
What happened:
I am using Lambda to write streaming data (from DynamoDB) to deltalake. Although I have disabled concurrent batches per shard, I do not have full control over concurrent Lambda invocations, because Lambda polls each shard separately and may trigger another invocation within the same batch window. What I observed is that when concurrent Lambda invocations run in the same second, each using write_deltalake to write to the same table path in S3, I get LockClientError::VersionAlreadyExists(77). Error occured: Failed to commit transaction: Metadata changed since last commit.
That batch of data is then not written to the table.
I was hoping the lock provider could acquire locks on the full path, including partition columns, instead of locking at the table level, in which case this issue could be avoided.
Any suggestions on how I can resolve this issue? Thanks!
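One workaround I am considering is to retry the whole write when the commit conflicts. Below is a minimal sketch, not a confirmed fix. It assumes the write is an append that is safe to re-run, and that the conflict can be recognised from the error text quoted above. The helper name retry_on_commit_conflict and its parameters are hypothetical, and table_uri and batch in the usage comment are placeholders.

```python
import random
import time


def retry_on_commit_conflict(write_fn, max_attempts=5, base_delay=0.2,
                             is_conflict=None, sleep=time.sleep):
    """Call write_fn(), retrying with jittered exponential backoff on conflicts.

    Hypothetical helper, not part of deltalake.
    """
    if is_conflict is None:
        # Assumption: a conflict is identified by the message text seen in
        # this report; adjust the check to whatever your version raises.
        def is_conflict(exc):
            msg = str(exc)
            return ("VersionAlreadyExists" in msg
                    or "Metadata changed since last commit" in msg)

    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except Exception as exc:
            # Give up on the last attempt or on any unrelated error.
            if attempt == max_attempts or not is_conflict(exc):
                raise
            delay = base_delay * 2 ** (attempt - 1)
            # Random jitter keeps concurrent invocations from retrying in lockstep.
            sleep(delay + random.uniform(0, delay))


# Usage inside the Lambda handler (table_uri and batch are placeholders):
# from deltalake import write_deltalake
# retry_on_commit_conflict(
#     lambda: write_deltalake(table_uri, batch, mode="append")
# )
```

Each retry calls write_deltalake again from scratch, so it picks up the latest table version before committing. Files written by a failed attempt are not committed, so they would not appear in the table, though they may remain in S3 as orphaned files.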
What you expected to happen: I expected the data to be loaded successfully without failures.
How to reproduce it:
More details: