aws-sdk-pandas
LOCK parameter in to_sql/copy
I'm currently using wr.redshift.to_sql()/copy() to store data in Redshift from an Airflow pipeline in which multiple DAGs write to the same table.
I run into this serializable isolation error:
Serializable isolation violation on table
With lock=True, the first DAG instance gets stuck at the step that stores data to Redshift, and the rest are held up until they all fail.
Any clue on handling concurrent transactions to Redshift using wr?
Just to help narrow down the problem: which to_sql/copy mode are you using? Also, is lock=True set for both? It would help if you could share the code.
I used lock=True with both to_sql() and copy(), with mode="upsert".
The code was:
wr.redshift.copy(
    df=data,
    dtype=cast_dict,
    con=conn,
    schema=SCHEMA,
    table=TABLE,
    mode="upsert",
    lock=True,
    primary_keys=PRIMARY_KEYS,
    path=PATH,
    index=False,
)
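For readers hitting the same error: the thread never reached a confirmed fix, but a common general approach to serializable isolation violations is to retry the failed write after a short, randomized delay. The sketch below is only an illustration under stated assumptions. The helper name retry_on_serialization_error and its parameters are hypothetical and not part of awswrangler, and it assumes the error text contains the phrase "Serializable isolation violation" as quoted above.

```python
import random
import time


def retry_on_serialization_error(write_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call write_fn(), retrying when it fails with a serialization conflict.

    Hypothetical helper, not part of awswrangler. It matches on the error
    message text, which is an assumption based on the error quoted above.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except Exception as exc:
            is_conflict = "Serializable isolation violation" in str(exc)
            # Re-raise unrelated errors immediately, and give up after the last attempt.
            if not is_conflict or attempt == max_attempts:
                raise
            # Exponential backoff plus jitter so concurrent writers do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

It could wrap the call above, for example retry_on_serialization_error(lambda: wr.redshift.copy(df=data, con=conn, schema=SCHEMA, table=TABLE, mode="upsert", primary_keys=PRIMARY_KEYS, path=PATH, index=False)). Whether the connection needs an explicit rollback between attempts depends on the driver and is not covered here.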
@aeeladawy could you share the stack trace as well, please?
Closing due to inactivity