httpx
Change default encoding to utf-8 in `normalize_header_key` and `normalize_header_value` functions
Description
I have encountered encoding errors with some requests whose headers contain non-ASCII characters, because the header normalization functions default to ASCII. Changing the default encoding to UTF-8 resolves these errors. I propose updating the `normalize_header_key` and `normalize_header_value` functions in `_utils.py` to use UTF-8 as the default encoding.
Steps to Reproduce
- Call `normalize_header_key` or `normalize_header_value` with a non-ASCII string and no encoding specified.
- Observe the encoding failure (`UnicodeEncodeError`) with the default ASCII encoding.
- Change the encoding to UTF-8 and observe that the error is resolved.
Example Code
from httpx._utils import normalize_header_key  # private module named in this issue

header_key_unicode = "内容类型"

# With the current ASCII default, this call raises a UnicodeEncodeError.
normalized_key_unicode = normalize_header_key(header_key_unicode, lower=True)

# Passing UTF-8 explicitly works correctly.
normalized_key_unicode_utf8 = normalize_header_key(header_key_unicode, lower=True, encoding="utf-8")
print(normalized_key_unicode_utf8)
Proposed Solution
Modify the _utils.py file to use UTF-8 as the default encoding:
def normalize_header_key(
    value: str | bytes,
    lower: bool,
    encoding: str | None = None,
) -> bytes:
    """
    Coerce str/bytes into a strictly byte-wise HTTP header key.
    """
    if isinstance(value, bytes):
        bytes_value = value
    else:
        bytes_value = value.encode(encoding or "utf-8")
    return bytes_value.lower() if lower else bytes_value


def normalize_header_value(value: str | bytes, encoding: str | None = None) -> bytes:
    """
    Coerce str/bytes into a strictly byte-wise HTTP header value.
    """
    if isinstance(value, bytes):
        return value
    return value.encode(encoding or "utf-8")
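As a quick sanity check of the proposal, the sketch below copies the two functions so that it runs without httpx installed, then exercises them with ASCII, non-ASCII, and bytes input. The helper name `check_proposed_defaults` is hypothetical and exists only for illustration; it is not part of httpx.

```python
from __future__ import annotations


def normalize_header_key(
    value: str | bytes,
    lower: bool,
    encoding: str | None = None,
) -> bytes:
    """Copy of the proposed function: UTF-8 is used when no encoding is given."""
    if isinstance(value, bytes):
        bytes_value = value
    else:
        bytes_value = value.encode(encoding or "utf-8")
    return bytes_value.lower() if lower else bytes_value


def normalize_header_value(value: str | bytes, encoding: str | None = None) -> bytes:
    """Copy of the proposed function: UTF-8 is used when no encoding is given."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding or "utf-8")


def check_proposed_defaults() -> dict[str, bytes]:
    """Hypothetical helper: collect results for a few representative inputs."""
    return {
        "ascii_key": normalize_header_key("Content-Type", lower=True),
        "unicode_key": normalize_header_key("内容类型", lower=True),
        "bytes_key": normalize_header_key(b"X-Custom", lower=False),
        "unicode_value": normalize_header_value("café"),
    }


if __name__ == "__main__":
    for name, result in check_proposed_defaults().items():
        print(name, result)
```

Note that a caller who passes `encoding="ascii"` explicitly would still get a `UnicodeEncodeError` for non-ASCII input, since only the fallback default changes.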
Rationale
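Using UTF-8 as the default encoding lets both functions accept any Unicode string without raising a `UnicodeEncodeError`. ASCII covers only 128 characters, whereas UTF-8 can encode every Unicode code point. Because UTF-8 is a superset of ASCII, headers that contain only ASCII characters produce exactly the same bytes as before.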
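To illustrate the point with the standard library alone: for pure-ASCII text, ASCII and UTF-8 produce identical bytes, while text outside the ASCII range can only be encoded with UTF-8. The sketch below is not httpx code, and the helper name `encodes_identically` is made up for this example.

```python
def encodes_identically(text: str) -> bool:
    """Return True if ASCII and UTF-8 produce the same bytes for `text`.

    Returns False when `text` cannot be encoded as ASCII at all.
    """
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(encodes_identically("Content-Type"))  # ASCII-only header name
    print(encodes_identically("内容类型"))  # non-ASCII header name
```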