codec name acceptance became way too lenient in 3.9
| BPO | 46508 |
|---|---|
| Nosy | @gpshead |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
assignee = None
closed_at = None
created_at = <Date 2022-01-25.00:12:22.597>
labels = ['type-bug', '3.9', '3.10', '3.11']
title = 'codec name acceptance became way too lenient in 3.9'
updated_at = <Date 2022-01-25.00:37:48.134>
user = 'https://github.com/gpshead'
bugs.python.org fields:
activity = <Date 2022-01-25.00:37:48.134>
actor = 'gregory.p.smith'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2022-01-25.00:12:22.597>
creator = 'gregory.p.smith'
dependencies = []
files = []
hgrepos = []
issue_num = 46508
keywords = ['3.9regression']
message_count = 2.0
messages = ['411535', '411540']
nosy_count = 1.0
nosy_names = ['gregory.p.smith']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue46508'
versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']
In 3.8 this was not a valid codec name: "เ_เ_เ_iDnA". In 3.9 it gets treated as idna and triggers the punycode decoder when passed to bytes.decode(codec).
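
A minimal sketch for checking how a given interpreter resolves the name above. `probe_codec_name` is a hypothetical helper written for this illustration, not part of the standard library; it only relies on `codecs.lookup()` and reports whichever outcome the running version produces, since the result depends on the Python version.

```python
import codecs


def probe_codec_name(name: str) -> str:
    """Return the canonical codec name, or "LookupError" if the name is rejected."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return "LookupError"


if __name__ == "__main__":
    # ascii() keeps the output printable even on a non-UTF-8 terminal.
    for candidate in ("idna", "เ_เ_เ_iDnA"):
        print(ascii(candidate), "->", probe_codec_name(candidate))
```

Per the report, 3.8 and earlier should print `LookupError` for the second name, while an affected 3.9+ build should resolve it to `idna`.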
Discovered by oss-fuzz.
Likely a consequence of https://bugs.python.org/issue37751 aka https://github.com/python/cpython/issues/81932
The consequence of this change is that anyone can stuff heinous strings into codec names and get non-LookupError behavior out of them. Anywhere codec names can come from user input, this has many potential negative consequences.
<=3.8 gave LookupError("unknown encoding: ...
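
Until the lookup itself is tightened, code that accepts codec names from user input could validate them first. The following is only a sketch under assumptions: `strict_lookup` and the allowlist pattern are hypothetical and chosen for illustration; this is not the fix proposed for CPython, and it is stricter than what 3.8 accepted.

```python
import codecs
import re

# Assumption: legitimate codec names are plain ASCII letters and digits,
# optionally separated by '.', '_', '-' or spaces.
_SAFE_CODEC_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._\- ]*")


def strict_lookup(name: str) -> codecs.CodecInfo:
    """Look up a codec, rejecting names outside the ASCII allowlist."""
    if not _SAFE_CODEC_NAME.fullmatch(name):
        # Same exception type that codecs.lookup() raises for unknown names.
        raise LookupError(f"unknown encoding: {name}")
    return codecs.lookup(name)
```

With this helper, `strict_lookup("utf-8")` still resolves normally, while `strict_lookup("เ_เ_เ_iDnA")` raises `LookupError` before the name ever reaches the codec registry.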
While figuring this issue out, it may also make sense to address https://github.com/python/cpython/issues/88886.
The oss-fuzz issue that discovered this: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=44030#c9 (opening to the public next week)