boto3
Add ConditionExpression to dynamodb batch writer
I have a use case where I want to write a few dozen rows to dynamodb at a time, with conditions.
Use Case
But there's a certain edge case I'm trying to handle, where I'm trying to write two sets of data to the table which describe the same thing, but one is more recent (and therefore more accurate) than the other. So I need to use DynamoDB conditions to say: only write a row if it doesn't exist yet, or if the existing row is older than the new one.
I can do this with a for loop over put_item calls:
    response = table.put_item(
        Item=row,
        ReturnValues='ALL_OLD',
        ReturnConsumedCapacity='NONE',
        ConditionExpression=cond
    )
Including the server-side atomic logic, this is basically:
    for new_row in rows:
        if new_row not already in table:
            put_item(new_row)  # new
        elif existing_row['t'] < new_row['t']:
            put_item(new_row)  # overwrite, because the new row is more recent
        else:
            pass  # do nothing, because the data in the table is newer than what we're trying to write
This is pretty slow, because there's a separate HTTP call for each row.
I would like to use the batch writer for this, but it currently doesn't accept Conditions.
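For reference, the per-row loop described above could look roughly like the sketch below. It assumes `table` is a boto3 DynamoDB Table resource; the function name `put_rows_if_newer` and the default attribute names `regionid` and `t` are illustrative choices, not part of boto3. A failed condition surfaces as `ConditionalCheckFailedException`, which is caught per row so the loop can continue.

```python
def put_rows_if_newer(table, rows, key_attr="regionid", version_attr="t"):
    """Write each row only if it is new or more recent than the stored row.

    `table` is assumed to be a boto3 DynamoDB Table resource.
    Returns the rows whose condition failed (the table held newer data).
    """
    # Raised by DynamoDB when a ConditionExpression evaluates to false.
    condition_failed = table.meta.client.exceptions.ConditionalCheckFailedException
    skipped = []
    for row in rows:
        try:
            table.put_item(
                Item=row,
                # Name placeholders avoid clashes with DynamoDB reserved words.
                ConditionExpression="attribute_not_exists(#k) OR #v < :new",
                ExpressionAttributeNames={"#k": key_attr, "#v": version_attr},
                ExpressionAttributeValues={":new": row[version_attr]},
            )
        except condition_failed:
            # The stored row is at least as recent, so leave it untouched.
            skipped.append(row)
    return skipped
```

This still makes one HTTP call per row, so it only illustrates the behaviour the request wants from the batch writer; it does not address the speed problem.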
MWE
    import boto3
    from boto3.dynamodb.conditions import Key, Attr

    table_name = 'test_batch_write'
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(table_name)

    for delta in [0, 1]:
        with table.batch_writer() as batch:
            for i in range(10):
                val = i + delta * (i%2)
                cond = Attr('regionid').not_exists() & Attr('interval_end').not_exists()  # row doesn't yet exist
                cond |= Attr('val').lt(val)  # or val is lower
                print(f"Put {i}")
                batch.put_item(
                    Item={
                        'top_timestamp': i,
                        'regionid': 'QLD1',
                        'val': val
                    },
                    ConditionExpression=cond)
Actual Result:
    Traceback (most recent call last):
      File "batch_write.py", line 21, in <module>
        ConditionExpression=cond)
    TypeError: put_item() got an unexpected keyword argument 'ConditionExpression'
Expected result:
Dynamodb contains 10 rows.
For each row with an even i, the val column equals i, which equals top_timestamp.
For each other row, the val column equals i+1.
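The expected table contents can be checked without AWS by simulating the condition against a plain dict. This is only a local model of the MWE's logic: it assumes the table's primary key is (`regionid`, `top_timestamp`), which the MWE does not state, and `conditional_put` is a hypothetical helper.

```python
def conditional_put(table, item):
    """Apply the MWE's condition locally: write if new, or if the stored val is lower."""
    key = (item["regionid"], item["top_timestamp"])  # assumed primary key
    existing = table.get(key)
    if existing is None or existing["val"] < item["val"]:
        table[key] = item
        return True
    return False  # DynamoDB would reject this put


table = {}
for delta in [0, 1]:
    for i in range(10):
        val = i + delta * (i % 2)
        conditional_put(table, {"top_timestamp": i, "regionid": "QLD1", "val": val})

final_vals = [table[("QLD1", i)]["val"] for i in range(10)]
print(final_vals)  # [0, 2, 2, 4, 4, 6, 6, 8, 8, 10]
```

On the second pass (delta = 1) only the odd rows carry a higher val, so only those are overwritten.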
Note that with individual put calls, if the condition fails a boto3.client('dynamodb').exceptions.ConditionalCheckFailedException exception is thrown, which has to be caught inside each iteration of my for loop. I'm not sure what the best way to handle that for a batch write is.
@mdavis-xyz - Thank you for the post. Marking this as a feature request for adding ConditionExpression to dynamodb batch writer.
+1 for this feature, it would be hugely helpful in my work.
Took a quick look into this. It appears we need to add support for ConditionExpression into the batch_write_item method before it's possible to fully satisfy this request. Said method doesn't appear to be open-source, or at least I can't find it, otherwise I would have been happy to consider adding this feature myself.
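To illustrate why the batch writer has nowhere to pass a condition, here is a sketch of the request shape that batch_write_item takes. As far as I know, the underlying DynamoDB BatchWriteItem operation accepts at most 25 put or delete requests per call and does not allow conditions on individual requests. The helper name `build_batch_requests` is made up for this example.

```python
def build_batch_requests(table_name, items, chunk_size=25):
    """Yield RequestItems payloads for batch_write_item, at most 25 puts each."""
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        yield {
            table_name: [
                # A PutRequest carries only the item; there is no slot for
                # ConditionExpression or ExpressionAttributeNames here.
                {"PutRequest": {"Item": item}}
                for item in chunk
            ]
        }


rows = [{"top_timestamp": i, "regionid": "QLD1", "val": i} for i in range(30)]
payloads = list(build_batch_requests("test_batch_write", rows))
```

Each payload could then be passed as RequestItems to batch_write_item on the service resource. A DeleteRequest similarly carries only a Key.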
+1 for this idea. I have to update a table with incremental updates and this would allow me to avoid a whole bunch of client-side logic to decide if an item already exists or not.
+1 for this idea, would simplify many use-cases.
@swetashre, any ETA when this feature request may be worked on?
any potential to bump this Feature request functionality? @swetashre
+1 for this idea. Would love this, and happy as a consumer to get a report of errored puts after batch completion to handle as needed.
+1 this would be very helpful!
+1 I would like to see this possible too
+1. I can easily make use of this to simplify my code.
+1 this would be super helpful to me also!
+1, would be really useful!
+1
+1
+1, when a condition fails it can return the list of items that failed the condition. Please take this as a priority.
+1, High Prio
is there any possible feedback on this?
+1, any response on this?
+1... yet another multi-year dynamodb feature request long forgotten by aws team :(
+1.. this would make batch writer really useful
the same issue is valid for delete_item()
TypeError: BatchWriter.delete_item() got an unexpected keyword argument 'ExpressionAttributeNames'