aiohttp does not skip response body when a HEAD request response has a body when using C extensions

Open jonathon-love opened this issue 10 months ago • 20 comments

Describe the bug

aiohttp barfs on a Dropbox download that other software (e.g. curl) doesn't seem to have difficulty with.

To Reproduce

import aiohttp
import asyncio

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.head(url, allow_redirects=True) as resp:
            pass

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))

Expected behavior

200 OK

Logs/tracebacks

Traceback (most recent call last):
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client_proto.py", line 263, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "aiohttp/_http_parser.pyx", line 558, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1059, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/streams.py", line 671, in read
    await self._waiter
aiohttp.http_exceptions.HttpProcessingError: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Cellar/[email protected]/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "<stdin>", line 3, in download_file
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/Users/XXXX/Library/Caches/pypoetry/virtualenvs/fred-L1l92eZt-py3.12/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 1061, in start
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 400, message="Invalid character in chunk size:\n\n  b'\\x1f\\x8b\\x08'\n    ^", url='https://uc7f6f4bf01d26ef6c48853d3c4c.dl.dropboxusercontent.com/cd/0/get/CiHGZDofY40BymZ9BQFqKYP2T_NQroYSU7UaqEPQtQDkAYFB_GCSpSAvxX-j1cFVIVHx0qSE5QFz-cJ5cNrnweRXgVIyXn04tg3k_asEek_G5zxbLThKClANeLOqWVCZKB5o3tm7u0TEiWx3aiwqB3-A/file?dl=1'
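
For readers decoding the message: the three bytes quoted in the error, `b'\x1f\x8b\x08'`, match the start of a gzip stream (magic number plus the deflate method byte, per RFC 1952). The sketch below uses only the standard library and no aiohttp code; the helper names are made up for illustration. It checks that reading of the bytes and shows why they cannot begin a chunk-size line, which must start with a hex digit.

```python
import gzip
import string

GZIP_MAGIC = b"\x1f\x8b"   # ID1, ID2 from RFC 1952
DEFLATE_METHOD = 8         # CM value for deflate in RFC 1952
HEX_DIGITS = string.hexdigits.encode("ascii")


def looks_like_gzip(data: bytes) -> bool:
    """True if data starts with the gzip magic number and the deflate method byte."""
    return len(data) >= 3 and data[:2] == GZIP_MAGIC and data[2] == DEFLATE_METHOD


def starts_like_chunk_size(data: bytes) -> bool:
    """A chunked body has to begin with at least one hex digit."""
    return bool(data) and data[0] in HEX_DIGITS


offending = b"\x1f\x8b\x08"          # bytes quoted in the traceback
sample = gzip.compress(b"hello")     # any gzip stream produced by the stdlib

print(looks_like_gzip(offending))         # the quoted bytes are a gzip header
print(sample[:3] == offending)            # same prefix as a real gzip stream
print(starts_like_chunk_size(offending))  # not a valid start of a chunk size
```

This only identifies what the parser was looking at; it does not by itself show why the parser expected a chunk size at that point.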


Python Version

$ python --version

Python 3.12.3

aiohttp Version

$ python -m pip show aiohttp

Version: 3.11.11

multidict Version

$ python -m pip show multidict

Version: 6.1.0

propcache Version

$ python -m pip show propcache

0.2.1

yarl Version

$ python -m pip show yarl

1.18.3

OS

linux, macOS

Related component

Client

Additional context

No response

Code of Conduct

  • [X] I agree to follow the aio-libs Code of Conduct

jonathon-love avatar Jan 13 '25 03:01 jonathon-love

Is there a redirect involved? Might be due to recoding... The traceback shows the server responding with HTTP 400.

webknjaz avatar Jan 13 '25 10:01 webknjaz

yes redirects in play

% curl -I -L -v "https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1"
* Host www.dropbox.com:443 was resolved.
* IPv6: (none)
* IPv4: 162.125.83.18
*   Trying 162.125.83.18:443...
* Connected to www.dropbox.com (162.125.83.18) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=*.dropbox.com
*  start date: Nov 12 00:00:00 2024 GMT
*  expire date: Dec  8 23:59:59 2025 GMT
*  subjectAltName: host "www.dropbox.com" matched cert's "*.dropbox.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: www.dropbox.com]
* [HTTP/2] [1] [:path: /scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1 HTTP/2
> Host: www.dropbox.com
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 302 
HTTP/2 302 
< content-security-policy: script-src 'unsafe-eval' 'inline-speculation-rules' https://www.dropbox.com/static/api/ https://www.dropbox.com/pithos/* https://www.dropbox.com/page_success/ https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ https://accounts.google.com/gsi/client https://canny.io/sdk.js https://www.paypal.com/sdk/js https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ 'unsafe-inline' ; font-src https://* data: ; worker-src https://www.dropbox.com/static/serviceworker/ https://www.dropbox.com/encrypted_folder_download/service_worker.js https://www.dropbox.com/service_worker.js blob: ; report-uri https://www.dropbox.com/csp_log?policy_name=metaserver-whitelist ; child-src https://www.dropbox.com/static/serviceworker/ blob: ; frame-ancestors 'self' https://*.dropbox.com ; form-action https://docs.google.com/document/fsip/ https://docs.google.com/spreadsheets/fsip/ https://docs.google.com/presentation/fsip/ https://docs.sandbox.google.com/document/fsip/ https://docs.sandbox.google.com/spreadsheets/fsip/ https://docs.sandbox.google.com/presentation/fsip/ https://*.purple.officeapps.live-int.com https://officeapps-df.live.com https://*.officeapps-df.live.com https://officeapps.live.com https://*.officeapps.live.com https://paper.dropbox.com/cloud-docs/edit 'self' https://www.dropbox.com/ https://dl-web.dropbox.com/ https://photos.dropbox.com/ https://paper.dropbox.com/ https://showcase.dropbox.com/ https://www.hellofax.com/ https://app.hellofax.com/ https://www.hellosign.com/ https://app.hellosign.com/ https://docsend.com/ https://www.docsend.com/ https://help.dropbox.com/ https://navi.dropbox.jp/ https://a.sprig.com/ https://selfguidedlearning.dropboxbusiness.com/ https://instructorledlearning.dropboxbusiness.com/ https://sales.dropboxbusiness.com/ https://accounts.google.com/ https://api.login.yahoo.com/ https://login.yahoo.com/ https://experience.dropbox.com/ https://pal-test.adyen.com 
https://2e83413d8036243b-Dropbox-pal-live.adyenpayments.com/ https://onedrive.live.com/picker ; frame-src https://* carousel: dbapi-6: dbapi-7: dbapi-8: dropbox-client: itms-apps: itms-appss: ; style-src https://* 'unsafe-inline' 'unsafe-eval' ; default-src https://www.dropbox.com/playlist/ https://www.dropbox.com/v/s/playlist/ https://*.dropboxusercontent.com/p/hls_master_playlist/ https://*.dropboxusercontent.com/p/hls_playlist/ ; img-src https://* data: blob: ; connect-src https://* ws://127.0.0.1:*/ws blob: wss://dsimports.dropbox.com/ ; object-src 'self' https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ ; base-uri 'self' ; media-src https://* blob:
content-security-policy: script-src 'unsafe-eval' 'inline-speculation-rules' https://www.dropbox.com/static/api/ https://www.dropbox.com/pithos/* https://www.dropbox.com/page_success/ https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ https://accounts.google.com/gsi/client https://canny.io/sdk.js https://www.paypal.com/sdk/js https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ 'unsafe-inline' ; font-src https://* data: ; worker-src https://www.dropbox.com/static/serviceworker/ https://www.dropbox.com/encrypted_folder_download/service_worker.js https://www.dropbox.com/service_worker.js blob: ; report-uri https://www.dropbox.com/csp_log?policy_name=metaserver-whitelist ; child-src https://www.dropbox.com/static/serviceworker/ blob: ; frame-ancestors 'self' https://*.dropbox.com ; form-action https://docs.google.com/document/fsip/ https://docs.google.com/spreadsheets/fsip/ https://docs.google.com/presentation/fsip/ https://docs.sandbox.google.com/document/fsip/ https://docs.sandbox.google.com/spreadsheets/fsip/ https://docs.sandbox.google.com/presentation/fsip/ https://*.purple.officeapps.live-int.com https://officeapps-df.live.com https://*.officeapps-df.live.com https://officeapps.live.com https://*.officeapps.live.com https://paper.dropbox.com/cloud-docs/edit 'self' https://www.dropbox.com/ https://dl-web.dropbox.com/ https://photos.dropbox.com/ https://paper.dropbox.com/ https://showcase.dropbox.com/ https://www.hellofax.com/ https://app.hellofax.com/ https://www.hellosign.com/ https://app.hellosign.com/ https://docsend.com/ https://www.docsend.com/ https://help.dropbox.com/ https://navi.dropbox.jp/ https://a.sprig.com/ https://selfguidedlearning.dropboxbusiness.com/ https://instructorledlearning.dropboxbusiness.com/ https://sales.dropboxbusiness.com/ https://accounts.google.com/ https://api.login.yahoo.com/ https://login.yahoo.com/ https://experience.dropbox.com/ https://pal-test.adyen.com 
https://2e83413d8036243b-Dropbox-pal-live.adyenpayments.com/ https://onedrive.live.com/picker ; frame-src https://* carousel: dbapi-6: dbapi-7: dbapi-8: dropbox-client: itms-apps: itms-appss: ; style-src https://* 'unsafe-inline' 'unsafe-eval' ; default-src https://www.dropbox.com/playlist/ https://www.dropbox.com/v/s/playlist/ https://*.dropboxusercontent.com/p/hls_master_playlist/ https://*.dropboxusercontent.com/p/hls_playlist/ ; img-src https://* data: blob: ; connect-src https://* ws://127.0.0.1:*/ws blob: wss://dsimports.dropbox.com/ ; object-src 'self' https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ ; base-uri 'self' ; media-src https://* blob:
< content-type: text/html; charset=utf-8
content-type: text/html; charset=utf-8
< location: https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1#
location: https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1#
< pragma: no-cache
pragma: no-cache
< referrer-policy: strict-origin-when-cross-origin
referrer-policy: strict-origin-when-cross-origin
< set-cookie: gvc=ODUyODM5Mjc5ODY1MTk1ODYyNDEwOTI5ODU4NjU1NTIyOTU4NA==; Path=/; Expires=Sat, 12 Jan 2030 10:37:17 GMT; HttpOnly; Secure; SameSite=None
set-cookie: gvc=ODUyODM5Mjc5ODY1MTk1ODYyNDEwOTI5ODU4NjU1NTIyOTU4NA==; Path=/; Expires=Sat, 12 Jan 2030 10:37:17 GMT; HttpOnly; Secure; SameSite=None
< set-cookie: t=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Domain=dropbox.com; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=None
set-cookie: t=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Domain=dropbox.com; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=None
< set-cookie: __Host-js_csrf=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; Secure; SameSite=None
set-cookie: __Host-js_csrf=Sh_yiBM3_zZ94XnO_KsivosI; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; Secure; SameSite=None
< set-cookie: __Host-ss=4WDPeTzG94; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=Strict
set-cookie: __Host-ss=4WDPeTzG94; Path=/; Expires=Tue, 13 Jan 2026 10:37:17 GMT; HttpOnly; Secure; SameSite=Strict
< set-cookie: locale=en; Path=/; Domain=dropbox.com; Expires=Sat, 12 Jan 2030 10:37:17 GMT
set-cookie: locale=en; Path=/; Domain=dropbox.com; Expires=Sat, 12 Jan 2030 10:37:17 GMT
< x-content-type-options: nosniff
x-content-type-options: nosniff
< x-permitted-cross-domain-policies: none
x-permitted-cross-domain-policies: none
< x-robots-tag: noindex, nofollow, noimageindex
x-robots-tag: noindex, nofollow, noimageindex
< x-xss-protection: 1; mode=block
x-xss-protection: 1; mode=block
< content-length: 17
content-length: 17
< date: Mon, 13 Jan 2025 10:37:18 GMT
date: Mon, 13 Jan 2025 10:37:18 GMT
< strict-transport-security: max-age=31536000; includeSubDomains
strict-transport-security: max-age=31536000; includeSubDomains
< server: envoy
server: envoy
< cache-control: no-cache, no-store
cache-control: no-cache, no-store
< x-dropbox-response-origin: far_remote
x-dropbox-response-origin: far_remote
< x-dropbox-request-id: bd026e63f07b461c861a0141d529217a
x-dropbox-request-id: bd026e63f07b461c861a0141d529217a
< 

* Ignoring the response-body
* Connection #0 to host www.dropbox.com left intact
* Issue another request to this URL: 'https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1'
* Host uceb23753f12718e072510c60961.dl.dropboxusercontent.com:443 was resolved.
* IPv6: (none)
* IPv4: 162.125.83.15
*   Trying 162.125.83.15:443...
* Connected to uceb23753f12718e072510c60961.dl.dropboxusercontent.com (162.125.83.15) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Dropbox, Inc; CN=*.dl.dropboxusercontent.com
*  start date: Mar 25 00:00:00 2024 GMT
*  expire date: Mar 11 23:59:59 2025 GMT
*  subjectAltName: host "uceb23753f12718e072510c60961.dl.dropboxusercontent.com" matched cert's "*.dl.dropboxusercontent.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: uceb23753f12718e072510c60961.dl.dropboxusercontent.com]
* [HTTP/2] [1] [:path: /cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /cd/0/inline/CiEi3aKPJlVEgxJk_vuIcEeDmlUWSwdfUmGT12QWH6FVl4cO7QlWxwgx7wUMJuv58Hz5Q65Hou5q-TdP3P42qHEM5cQJAM6oAO2cbc0o4Zi_A-jjsjZ6xCtED4CGBPB6XA4/file?dl=1 HTTP/2
> Host: uceb23753f12718e072510c60961.dl.dropboxusercontent.com
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 302 
HTTP/2 302 
< content-type: application/json
content-type: application/json
< cache-control: no-cache
cache-control: no-cache
< content-security-policy: sandbox
content-security-policy: sandbox
< etag: 1736727306872417d
etag: 1736727306872417d
< location: /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1
location: /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1
< referrer-policy: no-referrer
referrer-policy: no-referrer
< set-cookie:  uc_session=bCkHnAr1AtATJck2rXB9JJ1dFnejeUFMPoOG3zyTqNjJTCmmuVgsU2CnJorvk8lG; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
set-cookie:  uc_session=bCkHnAr1AtATJck2rXB9JJ1dFnejeUFMPoOG3zyTqNjJTCmmuVgsU2CnJorvk8lG; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
< vary: Origin, Accept-Encoding
vary: Origin, Accept-Encoding
< x-robots-tag: noindex, nofollow, noimageindex
x-robots-tag: noindex, nofollow, noimageindex
< date: Mon, 13 Jan 2025 10:37:18 GMT
date: Mon, 13 Jan 2025 10:37:18 GMT
< server: envoy
server: envoy
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-dropbox-response-origin: far_remote
x-dropbox-response-origin: far_remote
< x-dropbox-request-id: df1e5c7fd8874d38bd1ab5b931da2536
x-dropbox-request-id: df1e5c7fd8874d38bd1ab5b931da2536
< 

* Ignoring the response-body
* Connection #1 to host uceb23753f12718e072510c60961.dl.dropboxusercontent.com left intact
* Issue another request to this URL: 'https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1'
* Found bundle for host: 0x600002705d40 [can multiplex]
* Re-using existing connection with host uceb23753f12718e072510c60961.dl.dropboxusercontent.com
* [HTTP/2] [3] OPENED stream for https://uceb23753f12718e072510c60961.dl.dropboxusercontent.com/cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1
* [HTTP/2] [3] [:method: HEAD]
* [HTTP/2] [3] [:scheme: https]
* [HTTP/2] [3] [:authority: uceb23753f12718e072510c60961.dl.dropboxusercontent.com]
* [HTTP/2] [3] [:path: /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1]
* [HTTP/2] [3] [user-agent: curl/8.7.1]
* [HTTP/2] [3] [accept: */*]
> HEAD /cd/0/inline2/CiEPuxccYngED2Y3lcJZd276KCSjpCzeSCIsG5o9cIs0MSQjqCVqS12zUm53ZvO6XHAtOS24G-prrElYBkBH74_Au0-jnmUGs-x9b-zm7O1wxp5Dr_v65NG04iP7RBcX_zakuvgZN_BXEeAaLQ2TtjZJSuC4Slwtw5SOX4YzE1UMkaULWIOPUv0ujgPvNOVZ6c5oZgYcrpeg8yAE46e5gKYQ-FU1s7nlXGSNAT1zIa4T-pwuHvV7ADoWFIc9MBFviQGDdWFeUq2u5Xgx8wPas2FppZ7mIQfffsIFbBpjb_o4Y83ofGk2GWD6UbqiwZaFVqlNqIdw4IUCa181c-52veI8a4LGXV9vUmz8_Ogetpn2zQ/file?dl=1 HTTP/2
> Host: uceb23753f12718e072510c60961.dl.dropboxusercontent.com
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/2 200 
HTTP/2 200 
< content-type: application/json
content-type: application/json
< accept-ranges: bytes
accept-ranges: bytes
< cache-control: max-age=60
cache-control: max-age=60
< content-disposition: attachment; filename=unspecified
content-disposition: attachment; filename=unspecified
< content-security-policy: sandbox
content-security-policy: sandbox
< content-security-policy: report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-usercontent ; sandbox allow-forms allow-scripts allow-top-navigation allow-popups
content-security-policy: report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-usercontent ; sandbox allow-forms allow-scripts allow-top-navigation allow-popups
< content-security-policy: form-action 'none' ; report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-noscript ; script-src 'none'
content-security-policy: form-action 'none' ; report-uri https://www.dropbox.com/csp_log?policy_name=blockserver-noscript ; script-src 'none'
< etag: 1736727306872417d
etag: 1736727306872417d
< pragma: public
pragma: public
< referrer-policy: no-referrer
referrer-policy: no-referrer
< set-cookie:  uc_session=6XxzAhBTSoFY5pgkaN63ZY50rYJq2ZY8a0dsXRwxXNAX2q0CH2z5cMIEvtd0Ek08; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
set-cookie:  uc_session=6XxzAhBTSoFY5pgkaN63ZY50rYJq2ZY8a0dsXRwxXNAX2q0CH2z5cMIEvtd0Ek08; Domain=dropboxusercontent.com; HttpOnly; Path=/; SameSite=None; Secure
< vary: Origin, Accept-Encoding
vary: Origin, Accept-Encoding
< x-content-security-policy: sandbox
x-content-security-policy: sandbox
< x-content-type-options: nosniff
x-content-type-options: nosniff
< x-robots-tag: noindex, nofollow, noimageindex
x-robots-tag: noindex, nofollow, noimageindex
< x-server-response-time: 187
x-server-response-time: 187
< x-webkit-csp: sandbox
x-webkit-csp: sandbox
< date: Mon, 13 Jan 2025 10:37:18 GMT
date: Mon, 13 Jan 2025 10:37:18 GMT
< server: envoy
server: envoy
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< content-length: 2146
content-length: 2146
< x-dropbox-response-origin: far_remote
x-dropbox-response-origin: far_remote
< x-dropbox-request-id: da6721bade384c63b8a156a1f059b964
x-dropbox-request-id: da6721bade384c63b8a156a1f059b964
< 

jonathon-love avatar Jan 13 '25 10:01 jonathon-love

Try passing requote_redirect_url=False when initializing the client session: https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession.requote_redirect_url.

You may also disable the redirects and walk them manually, verifying that the Location header value is quoted appropriately.
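
A minimal sketch of walking the redirects by hand, so each `Location` value can be inspected before it is followed. The transport is abstracted behind a hypothetical `fetch_head` callable and the URLs are placeholders, so the example runs without network access; none of these names come from aiohttp.

```python
import asyncio
from typing import Awaitable, Callable, List, Mapping, Tuple
from urllib.parse import urljoin

REDIRECT_STATUSES = frozenset({301, 302, 303, 307, 308})

# Hypothetical transport: given a URL, return (status, response headers).
HeadFetcher = Callable[[str], Awaitable[Tuple[int, Mapping[str, str]]]]


async def walk_redirects(
    url: str, fetch_head: HeadFetcher, max_hops: int = 10
) -> List[Tuple[int, str]]:
    """Follow redirects one hop at a time, recording (status, url) for each hop."""
    hops: List[Tuple[int, str]] = []
    for _ in range(max_hops):
        status, headers = await fetch_head(url)
        hops.append((status, url))
        location = headers.get("Location")
        if status not in REDIRECT_STATUSES or not location:
            return hops
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"more than {max_hops} redirects")


# Canned responses standing in for the network (placeholder URLs only).
CANNED = {
    "https://example.test/share": (302, {"Location": "https://dl.example.test/inline/file?dl=1"}),
    "https://dl.example.test/inline/file?dl=1": (302, {"Location": "/inline2/file?dl=1"}),
    "https://dl.example.test/inline2/file?dl=1": (200, {}),
}


async def fake_head(url: str) -> Tuple[int, Mapping[str, str]]:
    return CANNED[url]


hops = asyncio.run(walk_redirects("https://example.test/share", fake_head))
for status, hop_url in hops:
    print(status, hop_url)
```

With aiohttp, the fetcher could wrap `session.head(url, allow_redirects=False)` and return `(resp.status, resp.headers)`; the plain dict used here is case-sensitive, unlike real response headers.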

webknjaz avatar Jan 13 '25 10:01 webknjaz

@jonathon-love also note that curl goes for HTTP/2 which aiohttp won't upgrade. That's another difference.

webknjaz avatar Jan 13 '25 10:01 webknjaz

thanks for your help.

adding requote_redirect_url=False doesn't help:

import aiohttp
import asyncio

async def download_file(url):
    # requote_redirect_url=False keeps redirect URLs exactly as the server sent them
    async with aiohttp.ClientSession(requote_redirect_url=False) as session:
        async with session.head(url, allow_redirects=True) as resp:
            pass

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))

and the addition of --http1.1 to curl continues to work, i.e.

curl -I -L -v --http1.1 "https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1"

one thing worth noting is that this is a .head() call, rather than a .get().

        async with session.head(url, allow_redirects=True) as resp:
            pass

if i change this to a .get() call it all works as expected.

with thanks

jonathon-love avatar Jan 14 '25 00:01 jonathon-love

Is the issue present with AIOHTTP_NO_EXTENSIONS=1?

I'll try and reproduce in a couple of days.

Dreamsorcerer avatar Jan 14 '25 01:01 Dreamsorcerer

it works when AIOHTTP_NO_EXTENSIONS=1 is set!

jonathon-love avatar Jan 14 '25 01:01 jonathon-love

I wonder if there's something we're missing, to tell the C parser that it is a HEAD response. I suspect it's trying to parse a full response..
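
To illustrate the behaviour being asked for, here is a sketch that uses the standard library's `http.client` rather than aiohttp's parsers. The raw response is invented for the example (the real Dropbox HTTP/1.1 response is not shown in this thread): it announces a chunked, gzip-encoded body and is followed by bytes that are not a chunk size. A parser told that the request was HEAD returns an empty body without touching those bytes; the same parser treating it as GET tries to read a chunk size and fails.

```python
import http.client
import io


class FakeSocket:
    """Minimal stand-in for a socket; HTTPResponse only needs makefile()."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload

    def makefile(self, mode: str = "rb", *args, **kwargs) -> io.BytesIO:
        return io.BytesIO(self._payload)


# Invented response: headers describe a chunked gzip body, followed by stray bytes.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"Content-Encoding: gzip\r\n"
    b"\r\n"
    b"\x1f\x8b\x08 not a chunk size\r\n"
)


def read_body(method: str) -> bytes:
    """Parse RAW_RESPONSE as the reply to a request made with the given method."""
    response = http.client.HTTPResponse(FakeSocket(RAW_RESPONSE), method=method)
    response.begin()
    return response.read()


# HEAD: the body is skipped regardless of what the headers announce.
head_body = read_body("HEAD")

# GET: the parser tries to read a chunk size from the stray bytes and gives up.
try:
    read_body("GET")
    get_failed = False
except http.client.HTTPException:
    get_failed = True

print(head_body, get_failed)
```

The GET branch resembles the failure in the traceback, which would be consistent with the C parser not knowing that the response belongs to a HEAD request; this is an analogy, not a diagnosis of aiohttp's code.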

Dreamsorcerer avatar Jan 14 '25 01:01 Dreamsorcerer

I tried to reproduce this but I'm not seeing a failure

bdraco@MacBook-Pro-37 aiohttp % python -m pip show aiohttp
Name: aiohttp
Version: 3.11.11
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache-2.0
Location: /opt/homebrew/lib/python3.13/site-packages
Requires: aiohappyeyeballs, aiosignal, attrs, frozenlist, multidict, propcache, yarl
Required-by: aioharmony, aiohttp-asyncmdnsresolver, aioresponses, aioshelly, govee-api-laggat, nexia, pytest-aiohttp, python-kasa, snitun
bdraco@MacBook-Pro-37 aiohttp % python -m pip show yarl
Name: yarl
Version: 1.18.3
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache-2.0
Location: /opt/homebrew/lib/python3.13/site-packages
Requires: idna, multidict, propcache
Required-by: aiohttp, aioshelly, onvif-zeep-async
bdraco@MacBook-Pro-37 aiohttp % cat down.py 
import aiohttp
import asyncio

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.head(url, allow_redirects=True) as resp:
            print(resp.status)

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))
bdraco@MacBook-Pro-37 aiohttp % python3 down.py           
200
bdraco@MacBook-Pro-37 aiohttp % 

bdraco avatar Mar 17 '25 00:03 bdraco

still reproducible for me.

(fred-py3.13) c3113592@CCW210-9M0XYPP aiohttp % python -m pip show aiohttp   
Name: aiohttp
Version: 3.11.14
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache-2.0
Location: /Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages
Requires: aiohappyeyeballs, aiosignal, attrs, frozenlist, multidict, propcache, yarl
Required-by: 
(fred-py3.13) c3113592@CCW210-9M0XYPP aiohttp % python3 down.py           
Traceback (most recent call last):
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client_proto.py", line 264, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
                               ~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "aiohttp/_http_parser.pyx", line 558, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1059, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/streams.py", line 672, in read
    await self._waiter
aiohttp.http_exceptions.HttpProcessingError: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/c3113592/Downloads/aiohttp/down.py", line 10, in <module>
    asyncio.run(download_file(url))
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 720, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/c3113592/Downloads/aiohttp/down.py", line 6, in download_file
    async with session.head(url, allow_redirects=True) as resp:
               ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/Users/c3113592/Library/Caches/pypoetry/virtualenvs/fred-PQy0OPEg-py3.13/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1061, in start
    raise ClientResponseError(
    ...<5 lines>...
    ) from exc
aiohttp.client_exceptions.ClientResponseError: 400, message="Invalid character in chunk size:\n\n  b'\\x1f\\x8b\\x08'\n    ^", url='https://ucbabf37dbb083a5658d25841818.dl.dropboxusercontent.com/cd/0/get/CmADiigmRvYWgWUOajNrs0T3vE5W8M9cuQ2Rd6sYEUx29W76vPs1lb1SKq4LRo3fkCIQYv3HgzR-64cmX-1SSZfCe7fI6NXlRTZBY4KIwmbTC5_nOV4-OgHzq4R8nv6l1KayTE1rPe5LqBaMAeR5A7RR/file?dl=1'

it works if i set AIOHTTP_NO_EXTENSIONS=1 (or AIOHTTP_NO_EXTENSIONS=0 for that matter)
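
A possible explanation for why `AIOHTTP_NO_EXTENSIONS=0` also works, assuming aiohttp decides the flag with a plain truthiness check on the environment string (this is an assumption about its internals, not something confirmed in this thread). Under that assumption any non-empty value, including `"0"`, disables the C extensions. The function name below is hypothetical.

```python
from typing import Mapping


def extensions_disabled(env: Mapping[str, str]) -> bool:
    """Assumed check: any non-empty value counts as 'disable the C extensions'."""
    return bool(env.get("AIOHTTP_NO_EXTENSIONS"))


for value in ("1", "0", ""):
    print(repr(value), extensions_disabled({"AIOHTTP_NO_EXTENSIONS": value}))
print("unset", extensions_disabled({}))
```

If that assumption holds, only leaving the variable unset (or empty) keeps the C parser in use.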

jonathon

jonathon-love avatar Mar 17 '25 03:03 jonathon-love

Let me try running it in another directory just in case something is leaking across in my dev setup

bdraco avatar Mar 17 '25 03:03 bdraco

I can reproduce it when I do it in another directory

bdraco@MacBook-Pro-37 NEW_DIR % python3 -m pip show aiohttp
Name: aiohttp
Version: 3.11.14
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache-2.0
Location: /opt/homebrew/lib/python3.13/site-packages
Requires: aiohappyeyeballs, aiosignal, attrs, frozenlist, multidict, propcache, yarl
Required-by: aioharmony, aiohttp-asyncmdnsresolver, aioresponses, aioshelly, govee-api-laggat, nexia, pytest-aiohttp, python-kasa, snitun
bdraco@MacBook-Pro-37 NEW_DIR % cat issue_10322.py 
import aiohttp
import asyncio

async def download_file(url):
    async with aiohttp.ClientSession() as session:
        async with session.head(url, allow_redirects=True) as resp:
            print(resp.status)

url = 'https://www.dropbox.com/scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8udmlxsosn&st=2owyrsrs&dl=1'

asyncio.run(download_file(url))
bdraco@MacBook-Pro-37 NEW_DIR % python3 issue_10322.py 
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client_proto.py", line 264, in data_received
    messages, upgraded, tail = self._parser.feed_data(data)
                               ~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "aiohttp/_http_parser.pyx", line 558, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1059, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/streams.py", line 672, in read
    await self._waiter
aiohttp.http_exceptions.HttpProcessingError: 400, message:
  Invalid character in chunk size:

    b'\x1f\x8b\x08'
      ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/bdraco/NEW_DIR/issue_10322.py", line 11, in <module>
    asyncio.run(download_file(url))
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/opt/homebrew/Cellar/python@3.13/3.13.2/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/bdraco/NEW_DIR/issue_10322.py", line 6, in download_file
    async with session.head(url, allow_redirects=True) as resp:
               ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/opt/homebrew/lib/python3.13/site-packages/aiohttp/client_reqrep.py", line 1061, in start
    raise ClientResponseError(
    ...<5 lines>...
    ) from exc
aiohttp.client_exceptions.ClientResponseError: 400, message="Invalid character in chunk size:\n\n  b'\\x1f\\x8b\\x08'\n    ^", url='https://ucaffda43d36f7f2a3e4ec1034fb.dl.dropboxusercontent.com/cd/0/get/CmCv4cA8lE1F9e5DnpD8_diG0sDToNANIf8jHKHrxvsfsdDIKgXJURqhhK3B6mUNc3IhWlovgsPQJauoHaBPAg9u7oGpMUomsSloSGS50gj7VfF_AGiUb1nd7j95qFAmh4e29T_mh9ZgtO-BGgfRO6sJ/file?dl=1'
bdraco@MacBook-Pro-37 NEW_DIR % 

bdraco avatar Mar 17 '25 03:03 bdraco

Here is the transaction

['->',
 b'HEAD /scl/fi/818zs3kiois9qkv2axv9t/Tooth-Growth.omv?rlkey=f4m2lv069py0ezn8ud'
 b'mlxsosn&st=2owyrsrs&dl=1 HTTP/1.1\r\nHost: www.dropbox.com\r\nAccept: */'
 b'*\r\nAccept-Encoding: gzip, deflate, br\r\nUser-Agent: Python/3.13 aiohttp/3'
 b'.11.15.dev0\r\n\r\n']
['<-',
 b'HTTP/1.1 302 Found\r\nContent-Security-Policy: frame-src https://* carouse'
 b'l: dbapi-6: dbapi-7: dbapi-8: dropbox-client: itms-apps: itms-appss: blob: ;'
 b" base-uri 'self' ; default-src https://www.dropbox.com/playlist/ https://www"
 b'.dropbox.com/v/s/playlist/ https://*.dropboxusercontent.com/p/hls_master_pla'
 b'ylist/ https://*.dropboxusercontent.com/p/hls_playlist/ ; connect-src https:'
 b"//* ws://127.0.0.1:*/ws blob: wss://dsimports.dropbox.com/ ; script-src 'uns"
 b"afe-eval' 'inline-speculation-rules' https://www.dropbox.com/static/api/ htt"
 b'ps://www.dropbox.com/pithos/ https://cfl.dropboxstatic.com/static/ https://w'
 b'ww.dropboxstatic.com/static/ https://accounts.google.com/gsi/client https://'
 b'canny.io/sdk.js https://www.paypal.com/sdk/js https://www.google.com/recaptc'
 b"ha/ https://www.gstatic.com/recaptcha/ 'unsafe-inline' ; child-src https://w"
 b'ww.dropbox.com/static/serviceworker/ blob: ; report-uri https://www.dropbox.'
 b"com/csp_log?policy_name=metaserver-whitelist ; object-src 'self' https://cfl"
 b'.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ ; worker-sr'
 b'c https://www.dropbox.com/static/serviceworker/ https://www.dropbox.com/encr'
 b'ypted_folder_download/service_worker.js https://www.dropbox.com/service_work'
 b'er.js blob: ; media-src https://* blob: ; form-action https://docs.google.co'
 b'm/document/fsip/ https://docs.google.com/spreadsheets/fsip/ https://docs.goo'
 b'gle.com/presentation/fsip/ https://docs.sandbox.google.com/document/fsip/ ht'
 b'tps://docs.sandbox.google.com/spreadsheets/fsip/ https://docs.sandbox.google'
 b'.com/presentation/fsip/ https://*.purple.officeapps.live-int.com https://off'
 b'iceapps-df.live.com https://*.officeapps-df.live.com https://officeapps.live'
 b'.com https://*.officeapps.live.com https://paper.dropbox.com/cloud-docs/edit'
 b" 'self' https://www.dropbox.com/ https://dl-web.dropbox.com/ https://photos."
 b'dropbox.com/ https://paper.dropbox.com/ https://showcase.dropbox.com/ https:'
 b'//www.hellofax.com/ https://app.hellofax.com/ https://www.hellosign.com/ htt'
 b'ps://app.hellosign.com/ https://docsend.com/ https://www.docsend.com/ https:'
 b'//help.dropbox.com/ https://navi.dropbox.jp/ https://a.sprig.com/ https://se'
 b'lfguidedlearning.dropboxbusiness.com/ https://instructorledlearning.dropboxb'
 b'usiness.com/ https://sales.dropboxbusiness.com/ https://accounts.google.com/'
 b' https://api.login.yahoo.com/ https://login.yahoo.com/ https://experience.dr'
 b'opbox.com/ https://pal-test.adyen.com https://2e83413d8036243b-Dropbox-pal-l'
 b"ive.adyenpayments.com/ https://onedrive.live.com/picker ; frame-ancestors 's"
 b"elf' https://*.dropbox.com ; style-src https://* 'unsafe-inline' 'unsafe-eva"
 b"l' ; font-src https://* data: ; img-src https://* data: blob:\r\nContent-T"
 b'ype: text/html; charset=utf-8\r\nLocation: https://uc97cdcb1e623b4ad263e12'
 b'14499.dl.dropboxusercontent.com/cd/0/get/CmDw-aWRq7E0XBjxPV96OzeiNRFgdjbRxji'
 b'rg7Y7i8Je3lRcaI4-JoPGCg0Fz4PYZ3kKHlHe4me7l0s24_HSfWMfwqHztXRsZIPSJafmlGC8Elc'
 b'mCpmI1bQIb7HetVsbHkkZpKiQlrUFLVfi08Cpi9VT/file?dl=1#\r\nPragma: no-cache\r\n'
 b'Referrer-Policy: strict-origin-when-cross-origin\r\nSet-Cookie: gvc=NzE2NT'
 b'MxMjg2MjYzNDk0OTY5MDY0NTA3NTIwNTU5MzczMzkyMTU=; Path=/; Expires=Sat, 16 Mar '
 b'2030 03:26:05 GMT; HttpOnly; Secure; SameSite=None\r\nSet-Cookie: t=siVqvo'
 b'_-BTNavohLU2HJr4L5; Path=/; Domain=dropbox.com; Expires=Tue, 17 Mar 2026 03:'
 b'26:05 GMT; HttpOnly; Secure; SameSite=None\r\nSet-Cookie: __Host-js_csrf=s'
 b'iVqvo_-BTNavohLU2HJr4L5; Path=/; Expires=Tue, 17 Mar 2026 03:26:05 GMT; Secu'
 b're; SameSite=None\r\nSet-Cookie: __Host-ss=GH9qslTt60; Path=/; Expires=Tue'
 b', 17 Mar 2026 03:26:05 GMT; HttpOnly; Secure; SameSite=Strict\r\nSet-Cooki'
 b'e: locale=en; Path=/; Domain=dropbox.com; Expires=Sat, 16 Mar 2030 03:26:05 '
 b'GMT\r\nX-Content-Type-Options: nosniff\r\nX-Permitted-Cross-Domain-Policies:'
 b' none\r\nX-Robots-Tag: noindex, nofollow, noimageindex\r\nX-Xss-Protection: '
 b'1; mode=block\r\nContent-Length: 17\r\nDate: Mon, 17 Mar 2025 03:26:06 G'
 b'MT\r\nStrict-Transport-Security: max-age=31536000; includeSubDomains\r\nServ'
 b'er: envoy\r\nCache-Control: no-cache, no-store\r\nX-Dropbox-Response-Origin:'
 b' far_remote\r\nX-Dropbox-Request-Id: d5979a6a499749ba9f0cf759344ca4a0\r'
 b'\n\r\n']
['->',
 b'HEAD /cd/0/get/CmDw-aWRq7E0XBjxPV96OzeiNRFgdjbRxjirg7Y7i8Je3lRcaI4-JoPGCg0Fz'
 b'4PYZ3kKHlHe4me7l0s24_HSfWMfwqHztXRsZIPSJafmlGC8ElcmCpmI1bQIb7HetVsbHkkZpKiQl'
 b'rUFLVfi08Cpi9VT/file?dl=1 HTTP/1.1\r\nHost: uc97cdcb1e623b4ad263e1214499.d'
 b'l.dropboxusercontent.com\r\nAccept: */*\r\nAccept-Encoding: gzip, deflate, b'
 b'r\r\nUser-Agent: Python/3.13 aiohttp/3.11.15.dev0\r\n\r\n']
['<-',
 b'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nAccept-Ranges: byte'
 b's\r\nCache-Control: max-age=60\r\nContent-Disposition: attachment; filename='
 b'"Tooth Growth.omv"; filename*=UTF-8\'\'Tooth%20Growth.omv\r\nContent-Securit'
 b'y-Policy: sandbox\r\nPragma: public\r\nReferrer-Policy: no-referrer\r\nVar'
 b'y: Origin, Accept-Encoding\r\nX-Content-Security-Policy: sandbox\r\nX-Conten'
 b't-Type-Options: nosniff\r\nX-Robots-Tag: noindex, nofollow, noimageindex\r\n'
 b'X-Server-Response-Time: 384\r\nX-Webkit-Csp: sandbox\r\nDate: Mon, 17 Mar 20'
 b'25 03:26:06 GMT\r\nServer: envoy\r\nStrict-Transport-Security: max-age=31536'
 b'000; includeSubDomains; preload\r\nContent-Encoding: gzip\r\nX-Dropbox-Respo'
 b'nse-Origin: far_remote\r\nX-Dropbox-Request-Id: 26c0a8d34e4b478c8d8dd39bc1'
 b'f429dc\r\nTransfer-Encoding: chunked\r\n\r\n\x1f\x8b\x08\x00\x00\x00'
 b'\x00\x00\x00\x03\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00']

bdraco avatar Mar 17 '25 03:03 bdraco

so dropbox is sending a response body on a HEAD request.

curl hint: * Ignoring the response-body
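As a rough check of what those trailing bytes are, the 20 bytes after the blank line in the capture can be fed to the standard library's gzip module. This sketch assumes the capture shows the complete tail; the constant name is made up for illustration.

```python
import gzip

# The bytes that follow the header section of the 200 response in the capture.
TAIL = (
    b"\x1f\x8b\x08\x00\x00\x00"
    b"\x00\x00\x00\x03\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00"
)

# 0x1f 0x8b is the gzip magic number; 0x08 selects the deflate method.
print(TAIL[:2] == b"\x1f\x8b")

# Decompressing shows what payload the stream carries.
print(gzip.decompress(TAIL))
```

So the tail looks like a bare gzip member rather than a chunk-size line, which would fit the "Invalid character in chunk size" message pointing at b'\x1f\x8b\x08'.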

bdraco avatar Mar 17 '25 03:03 bdraco

love your work!

jonathon-love avatar Mar 17 '25 03:03 jonathon-love

https://datatracker.ietf.org/doc/html/rfc9112#section-6.3-2.1 is pretty clear that it's not allowed to have a body:

Any response to a HEAD request and any response with a 1xx (Informational), 204 (No Content), or 304 (Not Modified) status code is always terminated by the first empty line after the header fields, regardless of the header fields present in the message, and thus cannot contain a message body or trailer section.
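The quoted rule can be written down as a small predicate. This is only a sketch of that one rule, with a hypothetical function name; it is not how aiohttp or llhttp implement it.

```python
def response_may_have_body(request_method: str, status: int) -> bool:
    """Apply the quoted RFC 9112 section 6.3 rule (first item only).

    Responses to HEAD, and any 1xx, 204 or 304 response, end at the first
    empty line after the header fields, whatever the headers say.
    """
    if request_method.upper() == "HEAD":
        return False
    if 100 <= status < 200 or status in (204, 304):
        return False
    return True


# The Dropbox response in this issue: a 200 answering a HEAD request.
print(response_may_have_body("HEAD", 200))
print(response_may_have_body("GET", 200))
```

Note that the header fields are deliberately not an input: Transfer-Encoding or Content-Length on such a response does not change the answer.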

aiohttp uses llhttp for the C parser: https://github.com/nodejs/llhttp

It seems like curl is more forgiving.

I think an argument could be made that llhttp should be as forgiving as curl and discard the unexpected body. I don't think there are any security or request smuggling implications to doing that (someone else needs to validate this statement). However, that's something the llhttp maintainers would need to decide. I'd suggest continuing at https://github.com/nodejs/llhttp/issues?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen and, if they decide to be forgiving about this violation, we can update the version of llhttp we bundle. I should note that if they implement it only in their lenient mode, we probably can't use it because of the security implications of turning that on by default.

bdraco avatar Mar 17 '25 03:03 bdraco

I don't think there are any security or request smuggling implications to doing that

I'm not so sure. If one parser is reading the next request (as it should do), and another parser is reading a body, that's a request smuggling attack.

There would literally be no way to distinguish between a body which looks like a request and a new request. Therefore I think that the bytes following the headers must always be treated as a new request, which is why there is a parsing error. At best, we could silence the parsing error in lax mode and just close the connection. The Python parser should probably behave the same way (I assume it doesn't, given your first attempt didn't error).

As curl sends and receives single requests, I assume that it doesn't use keep-alive connections, in which case it's not really a security issue for curl.
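To make the ambiguity concrete, here is a toy sketch (not aiohttp or llhttp code; all names are hypothetical) of two parsers reading the same bytes on a connection that only carries responses to HEAD requests. The strict one ends each message at the first empty line; the lenient one also skips Content-Length bytes as a body.

```python
import re

# Bytes that a lenient parser would treat as a body, but which also
# form a complete response of their own.
SMUGGLED = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
STREAM = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(SMUGGLED) + SMUGGLED


def count_head_responses(stream: bytes, lenient: bool) -> int:
    """Count the messages a toy parser sees in a stream of HEAD responses."""
    count = 0
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            break  # incomplete header section, stop
        count += 1
        if lenient:
            # Lenient mode: believe Content-Length and discard that many bytes.
            match = re.search(rb"(?i)^content-length:\s*(\d+)", head, re.M)
            if match:
                rest = rest[int(match.group(1)):]
        stream = rest
    return count


print(count_head_responses(STREAM, lenient=False))
print(count_head_responses(STREAM, lenient=True))
```

The two modes disagree on how many responses the stream contains, which is the kind of mismatch request smuggling relies on when two parsers share a keep-alive connection.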

Dreamsorcerer avatar Mar 17 '25 10:03 Dreamsorcerer

Actually, the error says "Invalid character in chunk size". Could this actually be the opposite issue? That llhttp is trying to read the body when it's not supposed to. If llhttp was processing it as a new request (as it should be), then the error message should have been about an invalid response line.
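A small sketch of that reasoning: the same leading bytes fail in either parser state, but the error differs, so the message hints at which state the parser was in. The state names and error labels below are hypothetical, not llhttp's actual ones.

```python
from typing import Optional

HEX_DIGITS = b"0123456789abcdefABCDEF"
TAIL = b"\x1f\x8b\x08"  # the bytes quoted in the error message


def error_for(state: str, data: bytes) -> Optional[str]:
    """Return a hypothetical error label for `data` seen in `state`, or None."""
    if state == "expect_status_line":
        # A parser waiting for the next response wants "HTTP/..." here.
        return None if data.startswith(b"HTTP/") else "invalid status line"
    if state == "expect_chunk_size":
        # A parser reading a chunked body wants a hex digit here.
        ok = bool(data) and data[0] in HEX_DIGITS
        return None if ok else "invalid character in chunk size"
    raise ValueError(f"unknown state: {state}")


print(error_for("expect_status_line", TAIL))
print(error_for("expect_chunk_size", TAIL))
```

Since the reported error is about the chunk size, the parser appears to have been in the body-reading state rather than waiting for a new message.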

Dreamsorcerer avatar Mar 17 '25 10:03 Dreamsorcerer

I'm wondering if there's a mistake around here: https://github.com/aio-libs/aiohttp/blob/28832b8922b2c20b76f999dc2ce48d163510cbbd/aiohttp/_http_parser.pyx#L455-L473

It looks like it'd assign a StreamReader to self._payload, then add an EMPTY_PAYLOAD to the messages. I wonder if it should be setting self._payload to an empty payload at the start.

Dreamsorcerer avatar Mar 17 '25 10:03 Dreamsorcerer

Created a test in #10587. Probably not going to look at it just yet (my guess at fixing it was wrong), but the C parser is behaving incorrectly and trying to parse the body.

If that is fixed, it might help out with this issue, as it may allow the response to be received and only error on the next request (which I assume is what happens with the Python parser).

Though the overall fix is obviously needed from Dropbox, who are sending invalid HTTP responses.

Dreamsorcerer avatar Mar 17 '25 12:03 Dreamsorcerer