
Slow IDNA decoding with large strings

Open guidovranken opened this issue 3 years ago • 2 comments

Bug report

Originally reported to the security address on September 9.

('xn--016c'+'a'*5000).encode('utf-8').decode('idna')

The execution time grows much faster than linearly with the input string size (roughly quadratically, judging by the timings below), which makes large inputs very slow:

  • 10 chars = 0.016 seconds
  • 100 chars = 0.047 seconds
  • 1000 chars = 2.883 seconds
  • 2500 chars = 17.724 seconds
  • 5000 chars = 1 min 10 seconds
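For anyone who wants to reproduce these measurements, here is a minimal timing sketch. The helper name time_idna_decode is made up for illustration, the sizes are kept small so the script finishes quickly, and absolute numbers will differ by machine and Python version. A UnicodeError from the codec is ignored because only the elapsed time is of interest.

```python
import time


def time_idna_decode(n):
    """Return the seconds spent decoding the reproducer with n trailing 'a' characters.

    Hypothetical helper for illustration; it is not part of CPython.
    """
    payload = ("xn--016c" + "a" * n).encode("utf-8")
    start = time.perf_counter()
    try:
        payload.decode("idna")
    except UnicodeError:
        # The codec may reject the label as invalid or too long;
        # only the time taken to reach that point matters here.
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (10, 100, 500):
        print(f"{n:>5} chars = {time_idna_decode(n):.3f} seconds")
```

Increasing the sizes toward the values in the list above should show the super-linear growth on an affected interpreter.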

Comment by @tiran:

According to the spec https://unicode.org/reports/tr46/ an IDNA label must not be longer than 63 characters. Python's idna module enforces the restriction, but too late: the length is checked only after the expensive decoding work has been done.

This may be abused in some cases, for example by passing a crafted host name to asyncio create_connection:

import asyncio

async def main():
    loop = asyncio.get_running_loop()

    await loop.create_connection(
        lambda: [], ('xn--016c'+'a'*5000).encode('utf-8'), 443
    )

asyncio.run(main())

Your environment

  • CPython versions tested on: CPython repository 'main' branch checkout, version 3.8.12, version 2.7.18
  • Operating system and architecture: Ubuntu Linux x64
  • PR: gh-99092
  • PR: gh-99222
  • PR: gh-99229
  • PR: gh-99230
  • PR: gh-99231
  • PR: gh-99232

guidovranken, Oct 19 '22 06:10

This is probably in ToUnicode and ToASCII of https://github.com/python/cpython/blob/main/Lib/encodings/idna.py and/or in https://github.com/python/cpython/blob/main/Lib/encodings/punycode.py itself. There we could presumably just do an up-front length check and reject inputs that are obviously too long to possibly decode into a label length that DNS standards will accept.
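As a rough sketch of what such an up-front check could look like in calling code (not the actual patch): the names check_label_lengths and safe_idna_decode and the 1024 threshold are assumptions for illustration only. The threshold is deliberately far above the 63-character label limit so that anything rejected here could not have decoded into a valid label anyway.

```python
MAX_LABEL_LEN = 1024  # assumed illustrative threshold, far above the 63-character limit


def check_label_lengths(hostname, max_len=MAX_LABEL_LEN):
    """Raise UnicodeError if any dot-separated label is longer than max_len.

    Hypothetical helper; accepts str or bytes and returns the input unchanged.
    """
    if isinstance(hostname, str):
        labels = hostname.split(".")
    else:
        labels = bytes(hostname).split(b".")
    for label in labels:
        if len(label) > max_len:
            raise UnicodeError(
                "label too long: %d > %d characters" % (len(label), max_len)
            )
    return hostname


def safe_idna_decode(raw):
    """Decode bytes with the idna codec, but only after the cheap length check."""
    check_label_lengths(raw)
    return raw.decode("idna")
```

The check is a single linear pass over the input, so an oversized label is rejected before any of the costly decoding starts.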

If there are libraries that allow an attacker-controlled hostname, without a reasonable length check on it, to get into a connect or similar call that tries idna decoding, that'd make this remotely exploitable. Based solely on code inspection, the urllib.request.HTTPRedirectHandler class is probably vulnerable to this - https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L652 - the location or uri headers it consumes on an HTTP 302 redirect response to construct the new URL are not obviously limited, nor is the host that ultimately winds its way down into the socket module. (I didn't test this, I was just reading code.) A test case would be to point urllib at a malicious server that sends a 2000 byte idna hostname in a 302 redirect header...
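Until a fixed interpreter is in place, an application could defend itself at the redirect step. The following is an untested-against-a-real-server sketch: LengthCheckedRedirectHandler and MAX_HOST_LEN are hypothetical names, and the 253-character cap (the conventional maximum length of a DNS name) is an assumption. It overrides the documented redirect_request hook and refuses redirect targets whose host name is implausibly long.

```python
import urllib.error
import urllib.parse
import urllib.request

MAX_HOST_LEN = 253  # assumed cap: conventional maximum length of a DNS name


class LengthCheckedRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that rejects redirects to over-long host names."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        host = urllib.parse.urlsplit(newurl).hostname or ""
        if len(host) > MAX_HOST_LEN:
            # Refuse the redirect before the host can reach the socket layer.
            raise urllib.error.HTTPError(
                req.full_url, code,
                "redirect target host name too long", headers, fp,
            )
        return super().redirect_request(req, fp, code, msg, headers, newurl)
```

It would be installed with urllib.request.build_opener(LengthCheckedRedirectHandler()), which replaces the default redirect handler because the class subclasses it.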

gpshead, Nov 04 '22 06:11

The issue #99083 was marked as a duplicate of this issue.

vstinner, Nov 04 '22 09:11

The PRs are either merged or will be merged before the next release (they are marked as release blockers), so I'm closing this.

CVE-2022-45061 has been assigned for tracking purposes.

gpshead, Nov 09 '22 06:11

I created https://python-security.readthedocs.io/vuln/slow-idna-large-strings.html to track this vulnerability. The fix is not merged into 3.8 and 3.9 branches yet.

vstinner, Nov 09 '22 10:11