boto3
Significant CPU time on large numbers of dynamo records
Describe the bug
We noticed that a lot of time was spent in the `_parse_shape` method when querying large numbers of records from DynamoDB.
Steps to reproduce
With the patch below we saw a 2x improvement in time, with no change to the types in the response.
```python
# Monkey patch: skip shape-driven parsing and hand the body straight to json.loads
from botocore.parsers import PROTOCOL_PARSERS

parser = PROTOCOL_PARSERS.get("json")

def new_fn(self, raw_body, shape):
    return self._parse_body_as_json(raw_body)

setattr(parser, "_handle_json_body", new_fn)
```
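To illustrate where the time goes, here is a minimal, self-contained sketch (not botocore's actual code): it compares handing the whole body to `json.loads`, as the patch above does, against an extra recursive pass that dispatches on every node, loosely analogous to botocore resolving a shape per member. The `parse_with_shape` helper and the payload layout are hypothetical stand-ins for illustration.

```python
import json

def parse_with_shape(node):
    # Recursively walk the structure, dispatching on type at every node,
    # roughly mimicking the per-member work botocore's _parse_shape does.
    if isinstance(node, dict):
        return {key: parse_with_shape(value) for key, value in node.items()}
    if isinstance(node, list):
        return [parse_with_shape(item) for item in node]
    return node

# A payload shaped like a DynamoDB-style response with many records.
records = [{"Item": {"id": {"S": str(i)}, "n": {"N": str(i)}}} for i in range(10_000)]
raw_body = json.dumps({"Responses": {"table": records}}).encode("utf-8")

direct = json.loads(raw_body)                     # what the patch does
shaped = parse_with_shape(json.loads(raw_body))   # extra per-node pass

# Both produce the same dict; the second pass only adds CPU time.
assert direct == shaped
```

Since DynamoDB already encodes types in the attribute-value wire format (`{"S": ...}`, `{"N": ...}`), the extra per-node pass adds work without changing the result, which is why skipping it is safe for this service in practice.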
If possible, could this be fixed by not using `_parse_shape` for DynamoDB (if that does not cause errors), or by allowing users to disable it per client for a performance improvement?
More info:
I wrote an article with my findings here https://maori.geek.nz/make-pythons-dynamodb-client-faster-with-this-one-simple-trick-2eb2888269ce
Hi @grahamjenson, thanks for the post. We'll take a look at your suggestion. (I edited it for clarity, the Markdown formatting wasn't applied quite right!)
Any updates?
Any updates on this? When doing a batch get of 100 records, I am seeing an improvement from 150 ms to 60 ms. Not sure why this is a P2; this effectively makes it impossible to use batch_get efficiently with boto3.
This is insane. For us, parsing the JSON takes 4x more time than the HTTP request to DynamoDB.