boto3
Significant CPU time on large numbers of dynamo records
Describe the bug
We noticed that a lot of time was spent in the `_parse_shape` method when querying large numbers of records from DynamoDB.
Steps to reproduce
With the patch below we saw a 2x improvement in time, with no change to the types in the response.
```python
# Monkey patch: skip shape-driven parsing and hand the body straight to json.loads
from botocore.parsers import PROTOCOL_PARSERS

parser = PROTOCOL_PARSERS.get("json")

def new_fn(self, raw_body, shape):
    return self._parse_body_as_json(raw_body)

setattr(parser, "_handle_json_body", new_fn)
```
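To illustrate where the time goes, here is a minimal, self-contained sketch (not botocore's actual code): it compares handing the whole body to `json.loads`, as the patch above does, against an extra recursive pass that dispatches on every node, loosely analogous to botocore resolving a shape per member. The `parse_with_shape` helper and the payload layout are hypothetical stand-ins for illustration.

```python
import json

def parse_with_shape(node):
    # Recursively walk the structure, dispatching on type at every node,
    # roughly mimicking the per-member work botocore's _parse_shape does.
    if isinstance(node, dict):
        return {key: parse_with_shape(value) for key, value in node.items()}
    if isinstance(node, list):
        return [parse_with_shape(item) for item in node]
    return node

# A payload shaped like a DynamoDB-style response with many records.
records = [{"Item": {"id": {"S": str(i)}, "n": {"N": str(i)}}} for i in range(10_000)]
raw_body = json.dumps({"Responses": {"table": records}}).encode("utf-8")

direct = json.loads(raw_body)                     # what the patch does
shaped = parse_with_shape(json.loads(raw_body))   # extra per-node pass

# Both produce the same dict; the second pass only adds CPU time.
assert direct == shaped
```

Since DynamoDB already encodes types in the attribute-value wire format (`{"S": ...}`, `{"N": ...}`), the extra per-node pass adds work without changing the result, which is why skipping it is safe for this service in practice.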
If possible, could this be fixed by not using `_parse_shape` for DynamoDB (if that does not cause errors), or by allowing users to disable it per client for a performance improvement?
More info:
I wrote an article with my findings here https://maori.geek.nz/make-pythons-dynamodb-client-faster-with-this-one-simple-trick-2eb2888269ce
Hi @grahamjenson, thanks for the post. We'll take a look at your suggestion. (I edited it for clarity, the Markdown formatting wasn't applied quite right!)
Any updates?
Any updates on this? When doing a batch get of 100 records, I am seeing an improvement from 150 ms to 60 ms. Not sure why this is a P2; this effectively makes it impossible to use batch_get efficiently with boto3.
This is insane. For us, parsing the JSON takes 4x more time than the HTTP request to DynamoDB.