Add ConditionExpression to dynamodb batch writer

Open mdavis-xyz opened this issue 4 years ago • 21 comments

I have a use case where I want to write a few dozen rows to dynamodb at a time, with conditions.

Use Case

But there's a certain edge case I'm trying to handle, where I'm trying to write two sets of data to the table which describe the same thing, but one is more recent (and therefore more accurate) than the other. So I need to use DynamoDB conditions to say: only write a row if it doesn't already exist, or if the existing row is older than the new one.

I can do this with a for loop over individual put_item calls:

response = table.put_item(
    Item=row,
    ReturnValues='ALL_OLD',
    ReturnConsumedCapacity='NONE',
    ConditionExpression=cond
)

Including the server-side atomic logic, this is basically:

for new_row in rows:
    if new_row not already in table:
        put_item(new_row)  # new
    elif existing_row['t'] < new_row['t']:
        put_item(new_row)  # overwrite because the new row is more recent
    else:
        pass  # do nothing, because the data in the table is newer than what we're trying to write

This is pretty slow, because there's a separate HTTP call for each row.

I would like to use the batch writer for this, but it currently doesn't accept Conditions.

MWE

import boto3
from boto3.dynamodb.conditions import Key, Attr

table_name = 'test_batch_write'
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(table_name)

for delta in [0, 1]:
    with table.batch_writer() as batch:
        for i in range(10):
            val = i + delta * (i%2)
            cond = Attr('regionid').not_exists() & Attr('interval_end').not_exists() # row doesn't yet exist
            cond |= Attr('val').lt(val) # or val is lower
            print(f"Put {i}")
            batch.put_item(
                Item={
                    'top_timestamp': i,
                    'regionid': 'QLD1',
                    'val': val
                },
                ConditionExpression=cond)

Actual Result:

Traceback (most recent call last):
  File "batch_write.py", line 21, in <module>
    ConditionExpression=cond)
TypeError: put_item() got an unexpected keyword argument 'ConditionExpression'

Expected result:

DynamoDB contains 10 rows. For each row with an even i, the val column equals i, which equals top_timestamp. For each row with an odd i, the val column equals i+1.
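
The expected contents can be checked without touching DynamoDB. Here is a minimal sketch that replays the MWE's two passes against a plain dict, applying the rule the condition is meant to express (write if the row is absent or the stored val is lower). The simulate_passes name and the dict standing in for the table are assumptions for illustration, not boto3 behaviour.

```python
def simulate_passes(n_rows=10, deltas=(0, 1)):
    """Replay the MWE's writes against an in-memory dict keyed by top_timestamp."""
    table = {}  # stands in for the DynamoDB table (assumption for illustration)
    for delta in deltas:
        for i in range(n_rows):
            val = i + delta * (i % 2)
            existing = table.get(i)
            # Same rule as the condition: row absent, or stored val is lower.
            if existing is None or existing["val"] < val:
                table[i] = {"top_timestamp": i, "regionid": "QLD1", "val": val}
    return table


result = simulate_passes()
for i in sorted(result):
    print(i, result[i]["val"])
```

On the second pass only the odd rows carry a higher val, so only those are overwritten.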

Note that with individual put calls, if the condition fails, a boto3.client('dynamodb').exceptions.ConditionalCheckFailedException is thrown, which has to be caught inside each iteration of my for loop. I'm not sure what the best way to handle that for a batch write is.
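
For reference, the per-row loop with that exception handling might look like the sketch below. put_rows_conditionally and condition_for are hypothetical names, not part of boto3. The helper still makes one HTTP call per row, so it does not fix the speed problem. It only shows one way the failed rows could be reported back to the caller. It reads the exception class from table.meta.client.exceptions, so it works with whatever Table resource is passed in.

```python
def put_rows_conditionally(table, rows, condition_for):
    """Put each row with its own condition; return the rows DynamoDB rejected.

    `table` is expected to be a boto3 DynamoDB Table resource.
    `condition_for` is a callable that builds the ConditionExpression for a row.
    Hypothetical helper for illustration, not a boto3 API.
    """
    check_failed = table.meta.client.exceptions.ConditionalCheckFailedException
    rejected = []
    for row in rows:
        try:
            table.put_item(Item=row, ConditionExpression=condition_for(row))
        except check_failed:
            # The stored row did not satisfy the condition, so it was left as-is.
            rejected.append(row)
    return rejected


# Possible usage with the table from the MWE (not executed here):
# from boto3.dynamodb.conditions import Attr
# rejected = put_rows_conditionally(
#     table, rows,
#     lambda row: Attr('regionid').not_exists() | Attr('val').lt(row['val']))
```

Returning the rejected rows rather than raising lets the caller decide whether a failed condition is an error or just stale data to skip.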

mdavis-xyz avatar Sep 11 '20 02:09 mdavis-xyz

@mdavis-xyz - Thank you for the post. Marking this as a feature request for adding ConditionExpression to dynamodb batch writer.

swetashre avatar Sep 16 '20 20:09 swetashre

+1 for this feature, it would be hugely helpful in my work.

Took a quick look into this. It appears we need to add support for ConditionExpression into the batch_write_item method before it's possible to fully satisfy this request. Said method doesn't appear to be open-source, or at least I can't find it, otherwise I would have been happy to consider adding this feature myself.

rrhodes avatar Feb 02 '21 13:02 rrhodes

+1 for this idea. I have to update a table with incremental updates, and this would allow me to avoid a whole bunch of client-side logic to decide whether an item already exists or not.

dazzag24 avatar Sep 03 '21 15:09 dazzag24

+1 for this idea, would simplify many use-cases.

rebelnn avatar Jan 10 '22 12:01 rebelnn

@swetashre, any ETA when this feature request may be worked on?

rrhodes avatar Jan 15 '22 13:01 rrhodes

Any potential to bump this feature request? @swetashre

am1ru1 avatar Feb 06 '22 21:02 am1ru1

+1 for this idea. Would love this, and happy as a consumer to get a report of errored puts after batch completion to handle as needed.

mcphersonwhite avatar Aug 19 '22 17:08 mcphersonwhite

+1 this would be very helpful!

danielbender1989 avatar Oct 25 '22 10:10 danielbender1989

+1 I would like to see this possible too

sooslaca avatar Oct 27 '22 13:10 sooslaca

+1. I can easily make use of this to simplify my code.

JimBrennan3 avatar Jan 20 '23 18:01 JimBrennan3

+1 this would be super helpful to me also!

aleshkoa avatar Jan 26 '23 13:01 aleshkoa

+1, would be really useful!

ben-buitendijk avatar Feb 10 '23 09:02 ben-buitendijk

+1

gilvikra avatar May 17 '23 23:05 gilvikra

+1

ETisREAL avatar Jun 17 '23 10:06 ETisREAL

+1. When a condition fails, it can return the list of items that failed the condition. Please take this as a priority.

bhanuunrivalled avatar Jul 18 '23 11:07 bhanuunrivalled

+1, High Prio

2023Learning avatar Jul 19 '23 12:07 2023Learning

@mdavis-xyz - Thank you for the post. Marking this as a feature request for adding ConditionExpression to dynamodb batch writer.

is there any possible feedback on this?

bhanuunrivalled avatar Jul 31 '23 10:07 bhanuunrivalled

+1, any response on this?

gabrieltardochi avatar Nov 08 '23 12:11 gabrieltardochi

+1... yet another multi-year dynamodb feature request long forgotten by aws team :(

egalvan10 avatar Apr 03 '24 04:04 egalvan10

+1.. this would make batch writer really useful

jayjb avatar May 06 '24 08:05 jayjb

the same issue is valid for delete_item()

TypeError: BatchWriter.delete_item() got an unexpected keyword argument 'ExpressionAttributeNames'

avoidik avatar Jun 06 '24 13:06 avoidik