requests
HTTPDigestAuth fails on non-latin credentials
This issue was reported before, but it was closed with an incorrect resolution.
https://github.com/psf/requests/blob/4f6c0187150af09d085c03096504934eb91c7a9e/requests/auth.py#L59-L63
Don't pass unicode strings in the arguments, but use UTF8 bytes instead.
self.session.get(main_url, auth=requests.auth.HTTPDigestAuth("Сергей_Ласточкин".encode('UTF-8'), '1234'))
Originally posted by @D-stefaang in https://github.com/psf/requests/issues/5089#issuecomment-763569911
But this is wrong! When I try to set the user 'Ondřej' following this advice, requests sends a bad string:
HTTPDigestAuth('Ondřej'.encode('utf-8'), 'heslíčko')
creates a header that starts with the wrong username:
Digest username="b'Ond\xc5\x99ej'"
Hi @ondratu,
Could you please clarify what you believe is wrong in this case? ř is the byte-sequence \xc5\x99 in UTF-8, so we'd expect the bytes object to be Ond\xc5\x99ej. We can quickly verify this by checking:
>>> 'Ondřej'.encode('utf-8') == b'Ond\xc5\x99ej'
True
It's not clear what other value you'd be expecting.
Hmm, on closer inspection this does appear to be a bug. We're using the bytes username as an argument to format our string for the header. This causes the full literal "b'Ond\xc5\x99ej'" to be used which I agree doesn't look correct. This header should be encoded as bytes during creation but we currently defer that to be urllib3's problem.
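To illustrate the formatting problem described above, here is a minimal sketch in plain Python. It does not call requests; it only assumes the header is built with ordinary string formatting, which turns a bytes argument into its repr().

```python
# Minimal sketch: formatting bytes into a str inserts the repr(), b'' wrapper included.
username_bytes = "Ondřej".encode("utf-8")
username_str = "Ondřej"

header_from_bytes = 'Digest username="%s"' % username_bytes
header_from_str = 'Digest username="%s"' % username_str

print(header_from_bytes)  # contains the literal b'...' text, not the name
print(header_from_str)    # contains the name, but is not latin-1 encodable
```

The first header carries the literal text of the bytes object rather than the user's name, which matches the bad header reported above.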
When the username/password are passed as plain str (not encoded to bytes), we get this:
Traceback (most recent call last):
File "/Users/nateprewitt/Work/OpenSource/requests/test.py", line 3, in <module>
r = requests.get('https://httpbin.org/digest-auth/auth/Ondřej/heslíčko', auth=h)
File "/Users/nateprewitt/Work/OpenSource/requests/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
[...]
File "/Users/nateprewitt/.pyenv/versions/3.10.3/lib/python3.10/http/client.py", line 1323, in _send_request
self.putheader(hdr, value)
File "/Users/nateprewitt/.pyenv/versions/3.10.3/lib/python3.10/site-packages/urllib3/connection.py", line 224, in putheader
_HTTPConnection.putheader(self, header, *values)
File "/Users/nateprewitt/.pyenv/versions/3.10.3/lib/python3.10/http/client.py", line 1255, in putheader
values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u0159' in position 20: ordinal not in range(256)
Fixing this is unfortunately somewhat complicated for a couple reasons:
1.) Users expect the output of HTTPDigestAuth to be a str in Python 3; changing that is likely a breaking change.
2.) If we were to encode the string, the library's current convention would be latin-1, not utf-8. That wouldn't solve this issue.
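A quick way to see why latin-1 does not help here: the sketch below checks which strings fit in latin-1. The helper name `fits_latin1` is made up for illustration and is not part of requests.

```python
def fits_latin1(text: str) -> bool:
    """Hypothetical helper: True if every character has a latin-1 code point."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


print(fits_latin1("José"))      # é (U+00E9) is inside latin-1
print(fits_latin1("Ondřej"))    # ř (U+0159) is outside latin-1
print(fits_latin1("heslíčko"))  # č (U+010D) is outside latin-1
```

Any credential containing a character above U+00FF fails the same way as the traceback above, regardless of where the encoding step happens.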
I don't believe we can ever format this correctly in Python 3 with the current behavior though. I'll need to look more tomorrow, but we may consider a behavior change if self.username/self.password is passed in as bytes.
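One possible shape for such a behavior change, as a rough sketch only: decode bytes credentials before formatting, then encode the finished field as UTF-8 bytes so no later latin-1 step is applied to it. The names `to_text` and `build_username_field` are hypothetical and this is not how requests is implemented; returning bytes also runs into point 1 above, and whether a given server accepts raw UTF-8 in the username is not guaranteed.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: decode bytes to str instead of formatting their repr()."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def build_username_field(username):
    """Hypothetical sketch: build only the username part of a Digest header as bytes."""
    # Decode first so no b'...' literal leaks into the header text.
    field = 'username="%s"' % to_text(username)
    # Encode the finished field explicitly (an assumption: UTF-8 on the wire).
    return field.encode("utf-8")


# Both str and UTF-8 bytes input produce the same bytes.
print(build_username_field("Ondřej"))
print(build_username_field("Ondřej".encode("utf-8")))
```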