aiobotocore icon indicating copy to clipboard operation
aiobotocore copied to clipboard

Adds support for checksums in streamed request trailers

Open terricain opened this issue 2 years ago • 9 comments

Description of Change

Adds support for S3 streaming uploads to support checksums in appended on the end of the request stream (trailers)

Running a snippet like the following will use streaming uploads and also the new SHA256 checksums

async def main():
    session = get_session()
    async with session.create_client('s3', region_name='eu-west-2') as s3_client:
        data = b'test1234'
        digest = hashlib.sha256(data).digest()

        resp = await s3_client.put_object(Bucket="BUCKET", Key='foobarbaz', Body=data, ChecksumAlgorithm='SHA256')
        sha256_trailer_checksum = base64.b64decode(resp['ChecksumSHA256'])

        assert digest == sha256_trailer_checksum

Note - when using Moto or an S3 endpoint not using HTTPS, the checksum will be added to the header, not the trailers as the headers are signed, the trailer is not (hence the requirements for HTTPS)

The problem comes with botocore.httpchecksum.AwsChunkedWrapper ending up as the body passed to aiohttp and it not liking it. We now subclass that wrapper class and add an async iterator implementation.

Fixes #954 Fixes https://github.com/terrycain/aioboto3/issues/265

Assumptions

That if "Transfer-Encoding": "Chunked" is always removed from the header dict before being passed to aiohttp is ok. (aiohttp reallllllly doesnt like it if you send it streaming body, and already set the transfer encoding :/ ). This ends up being injected from multiple places, mainly some of the checksum code as wells somewhere deep in the aws prepare_request() functions,


AWS Page - https://aws.amazon.com/blogs/aws/new-additional-checksum-algorithms-for-amazon-s3/

terricain avatar Aug 10 '22 22:08 terricain

@thehesiod am not too pleased with headers.pop('Transfer-Encoding', '') in httpsession.py you think theres a better way to do it?

terricain avatar Aug 10 '22 22:08 terricain

So when I run the S3 tests locally without the changes, I get the same number failed as with the changes, so I don't think they're related :)

terricain avatar Aug 10 '22 22:08 terricain

Yeah it looks like moto is unhappy for some reason. I ran the server standalone and pointed the tests at it and they seem to be getting 404'd at least on creating multipart uploads, even using boto3.


Update:

So if you downgrade Werkzeug to 2.1.2 and Flask to 2.1.3, all the tests pass, is related to https://github.com/spulec/moto/issues/5341

terricain avatar Aug 10 '22 23:08 terricain

thanks for all the research Terry! hopefully get to this soon, as it is getting that last release out it's 3am and I need to wake up at 7am, yowzers ;)

thehesiod avatar Aug 25 '22 10:08 thehesiod

looks like they pinned werkzeug + flask and I bumped us to the latest moto, may want to try with what we have and rebase to master

thehesiod avatar Aug 25 '22 10:08 thehesiod

just read a bit about this feature, very cool! This will solve the whole async file streaming support, since otherwise it didn't make much sense. I'm very interested in getting this working asap so will put further priority on reviewing this

thehesiod avatar Aug 25 '22 10:08 thehesiod

I could have used this feature several times in the last few years :)

thehesiod avatar Aug 25 '22 10:08 thehesiod

Codecov Report

Merging #962 (7eeb54a) into master (fbfeb12) will decrease coverage by 0.50%. The diff coverage is 45.45%.

@@            Coverage Diff             @@
##           master     #962      +/-   ##
==========================================
- Coverage   86.77%   86.26%   -0.51%     
==========================================
  Files          55       55              
  Lines        5322     5387      +65     
==========================================
+ Hits         4618     4647      +29     
- Misses        704      740      +36     
Flag Coverage Δ
unittests 86.26% <45.45%> (-0.51%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiobotocore/httpchecksum.py 28.57% <30.76%> (+3.57%) :arrow_up:
aiobotocore/client.py 86.26% <100.00%> (+0.07%) :arrow_up:
aiobotocore/httpsession.py 78.68% <100.00%> (+0.17%) :arrow_up:
tests/test_basic_s3.py 86.14% <100.00%> (+0.43%) :arrow_up:
tests/test_patches.py 96.07% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Aug 25 '22 22:08 codecov[bot]

Yeah the coverage isnt going to be great as we can't test this unless moto runs with TLS. We could make that would though I think that should be done separately.

terricain avatar Aug 25 '22 23:08 terricain

hey @terrycain for the delay, been nuts. Hey some good news I got my first sponsor for this project, can I share a few bucks with you?

thehesiod avatar Oct 25 '22 19:10 thehesiod

Sure :D always up for some free money this is my paypal link else I've finally had the will to setup the github sponsors stuff :).

terricain avatar Nov 09 '22 17:11 terricain

I dislike being "that guy" but wanted to check in to see what needs to happen to get this in? Looking forward to this fix.

coolacid avatar Nov 21 '22 16:11 coolacid

@thehesiod can we get this merged ignoring codecov?

terricain avatar Nov 25 '22 00:11 terricain

hey sorry been nuts, got covid and perhaps RSV too from our kids one of my highest priorities is getting this in, really want to look at this in detail

thehesiod avatar Nov 28 '22 19:11 thehesiod

ok changes look good, just want to dig into the Transfer-Encoding issue

thehesiod avatar Nov 29 '22 05:11 thehesiod

ok figured out the chunk thing, and added support for running https tests, unfortunately moto doesn't appear to support the algs yet so still need to xfail for now.

thehesiod avatar Nov 29 '22 07:11 thehesiod

Codecov Report

Merging #962 (f0f911a) into master (551343c) will increase coverage by 0.12%. The diff coverage is 94.05%.

@@            Coverage Diff             @@
##           master     #962      +/-   ##
==========================================
+ Coverage   84.98%   85.11%   +0.12%     
==========================================
  Files          50       50              
  Lines        4529     4602      +73     
==========================================
+ Hits         3849     3917      +68     
- Misses        680      685       +5     
Flag Coverage Δ
unittests 85.11% <94.05%> (+0.12%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiobotocore/httpchecksum.py 65.85% <92.00%> (+40.85%) :arrow_up:
tests/test_basic_s3.py 85.95% <92.30%> (+0.23%) :arrow_up:
tests/mock_server.py 74.75% <95.45%> (ø)
aiobotocore/client.py 86.26% <100.00%> (+0.07%) :arrow_up:
aiobotocore/httpsession.py 79.20% <100.00%> (+0.68%) :arrow_up:
tests/conftest.py 91.95% <100.00%> (+0.08%) :arrow_up:
tests/moto_server.py 81.91% <100.00%> (+0.39%) :arrow_up:
tests/test_patches.py 96.07% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Nov 29 '22 07:11 codecov[bot]

:tada:

terricain avatar Nov 29 '22 09:11 terricain

Confirmed this fixed setting checksum algorithm when using S3FS. Thanks!

coolacid avatar Nov 29 '22 14:11 coolacid