botocore
Urlencoding in SQS SendMessage is extremely expensive in CPU
Describe the bug
Using botocore to send messages to SQS can be very CPU-expensive because of urlencoding the message body.
A. Urlencode is called twice: as stated in multiple botocore issues, if I understand correctly, once for the signature and once in the preparation of the request body.
B. The urlencode implementation is extremely slow, so programs working with high-throughput data can spend much of their time just urlencoding, which can be even more time-consuming than the business logic itself.
We are using botocore in asyncio via aiobotocore, and urlencode blocks the event loop, making it practically unusable at high throughput.
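As a rough, stdlib-only illustration of the cost described above (not from the original report; the payload and parameter names are hypothetical), timing `urlencode` on a near-maximum SQS message body:

```python
import time
from urllib.parse import urlencode

# Hypothetical payload close to the SQS 256 KB message-size limit.
payload = "a" * (1024 * 240)
params = {"Action": "SendMessage", "MessageBody": payload}

start = time.perf_counter()
for _ in range(100):
    body = urlencode(params)  # pure-Python per-character quoting
elapsed = time.perf_counter() - start
print(f"100 urlencode calls: {elapsed:.3f}s")
```

Calling this twice per request, as described above, doubles whatever this loop measures.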
Steps to reproduce
Send large messages to an SQS queue and profile the CPU time spent in urlencode.
Expected behavior
Encoding shouldn't be the most expensive step of sending messages.
Profiling
Hi @yogevyuval, thanks for providing your feedback. We don’t support aiobotocore but I follow the reasoning behind your request.
You mentioned other botocore issues had brought up urlencode. Can you tell us which issues you were looking at?
As can be seen in https://github.com/boto/botocore/pull/1566, urlencode is now called twice instead of many times. But if it could be called only once, that would save half of the CPU time spent there.
Thanks @yogevyuval I think that is a reasonable feature request and we can keep this issue open to track it.
@tim-finnigan An update:
We patched AWSRequestPreparer._prepare_body with a faster Rust-based implementation of URL quoting (https://pypi.org/project/urlquote/) and saw roughly a 3x performance boost.
```python
from urllib.parse import urlencode

from botocore.awsrequest import AWSRequestPreparer
# urlquote (https://pypi.org/project/urlquote/) provides the Rust-based quote;
# these import names follow that package's published API.
from urlquote import quote as fast_quote
from urlquote.quoting import PYTHON_3_7_QUOTING


def patch_aws_request_urllib_parse():
    def _fast_quote(value, *args, **kwargs) -> str:
        return fast_quote(value, quoting=PYTHON_3_7_QUOTING).decode("utf-8")

    def _fast_prepare_body(self, original):
        """Prepares the given HTTP body data."""
        body = original.data
        if body == b"":
            body = None
        if isinstance(body, dict):
            params = [self._to_utf8(item) for item in body.items()]
            body = urlencode(params, doseq=True, quote_via=_fast_quote)
        return body

    AWSRequestPreparer._prepare_body = _fast_prepare_body
```
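For context on why this monkeypatch works at all: the stdlib `urlencode` accepts a `quote_via` hook, so any quoting implementation can be substituted. A stdlib-only sketch of the hook (the counting wrapper is illustrative, not part of the patch):

```python
from urllib.parse import urlencode, quote

calls = []

def counting_quote(value, *args, **kwargs):
    # Wraps stdlib quote just to show the hook being invoked;
    # the patch above substitutes a Rust implementation here instead.
    calls.append(value)
    return quote(value, *args, **kwargs)

body = urlencode(
    [("Action", "SendMessage"), ("MessageBody", "hello world")],
    doseq=True,
    quote_via=counting_quote,
)
print(body)        # Action=SendMessage&MessageBody=hello%20world
print(len(calls))  # one call per key and per value
```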
Just chiming in to say that I've somewhat verified this with the following:
```python
import io
import cProfile

from botocore.session import Session
from botocore.awsrequest import AWSResponse


class MockResponse(io.BytesIO):
    def stream(self, *args, **kwargs):
        yield self.read()


def stub(**kwargs):
    raw_body = MockResponse(
        b'<?xml version="1.0"?>'
        b'<SendMessageResponse xmlns="http://queue.amazonaws.com/doc/2012-11-05/">'
        b'<SendMessageResult><MessageId>eb8f0682-118a-4e63-b0b7-68337d38d962</MessageId>'
        b'<MD5OfMessageBody>food18db4cc2f85cedef654fccc4a4d8</MD5OfMessageBody>'
        b'</SendMessageResult>'
        b'</SendMessageResponse>'
    )
    return AWSResponse('https://example.com', 200, {}, raw_body)


ses = Session()
client = ses.create_client('sqs')
client.meta.events.register('before-send', stub)

payload = 'a' * (1024 * 240)

with cProfile.Profile() as pr:
    for _ in range(100):
        r = client.send_message(
            QueueUrl='...',
            MessageBody=payload,
        )

pr.dump_stats('t.prof')
```
This script removes networking as a factor. For large SQS message payloads (close to the 256 KB maximum), urlencode takes about ~80% of the runtime (~4% with networking, for me).
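A dump like `t.prof` can be inspected with the stdlib `pstats` module. A self-contained version that profiles urlencode directly (no botocore needed) and ranks the hot functions, which should show the quoting helpers near the top:

```python
import cProfile
import pstats
from urllib.parse import urlencode

payload = "a" * (1024 * 240)

with cProfile.Profile() as pr:
    for _ in range(10):
        urlencode({"MessageBody": payload})

# Rank by cumulative time and print the top entries.
stats = pstats.Stats(pr)
stats.sort_stats("cumulative").print_stats(5)
```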
The previous PR is still largely correct: to remove the duplicate preparation calls we'd need to do some refactoring around request "preparation" (a legacy concept from when we were built on requests). It's worth noting this only applies to query services, so one possible solution is to remove the notion of a dict body entirely and instead have the serializer handle the conversion directly and produce a bytes body. That would reduce the cost of calling prepare and remove the duplicate urlencode call. It would require that no post-serialization logic relies on the body being a dictionary for mutability purposes (I doubt this is the case, though).
Another possible solution is to get a little creative with caching request preparation, but that might be tricky and would require some care to ensure we're not leaking anything (memory, or state between instances of prepared requests).
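A minimal sketch of that bytes-body direction (the helper names here are hypothetical, not botocore's actual API): the serializer encodes exactly once, and preparation becomes a passthrough for bytes:

```python
from urllib.parse import urlencode

def serialize_query_body(params: dict) -> bytes:
    """Hypothetical serializer step: encode query params to bytes exactly once."""
    return urlencode(params, doseq=True).encode("utf-8")

def prepare_body(body):
    """Sketch of a passthrough preparation: bytes bodies are never re-encoded."""
    if isinstance(body, bytes):
        return body
    if isinstance(body, dict):
        # Legacy dict path: this is where the duplicate urlencode happens today.
        return urlencode(body, doseq=True)
    return body

encoded = serialize_query_body({"Action": "SendMessage", "MessageBody": "hi there"})
assert prepare_body(encoded) is encoded  # no second encoding pass
```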
@jonemo @nateprewitt @tim-finnigan It seems that the latest announcement regarding JSON protocol support will fix this issue, which is great. Any news on getting that into botocore and boto3?
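For a sense of why a JSON protocol helps here: `json.dumps` (C-accelerated in CPython) serializes the payload without per-character URL quoting. A rough stdlib comparison (payload size is illustrative):

```python
import json
import time
from urllib.parse import urlencode

payload = "a" * (1024 * 240)  # near the SQS 256 KB limit

start = time.perf_counter()
query_body = urlencode({"Action": "SendMessage", "MessageBody": payload})
query_time = time.perf_counter() - start

start = time.perf_counter()
json_body = json.dumps({"MessageBody": payload})
json_time = time.perf_counter() - start

print(f"query serialization: {query_time:.4f}s, json: {json_time:.4f}s")
```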
Patched boto3 to use https://github.com/blue-yonder/urlquote, which reduces CPU significantly.
This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.