Fix JWS verify
Note: This PR description is not yet finished, and I may rewrite some of these commits, so if you pull this branch, beware that you may need to rebase any changes you make to it!
Python 2 treated bytestrings and character strings as the same thing, a mistake that was thankfully fixed in Python 3. However, this did create a dilemma when updating this codebase to run on Python 3 - should module functions return bytestrings or character strings? Many, if not all, encryption packages sidestep this issue by foisting the distinction and conversion processes onto users, however, that makes implementing robust and correct applications more difficult.
Users of this package who have been running on Python 3 have come across many sharp edges of the bytes/str divide and reported them across multiple issues (#51, #153, #183, #184), and so with the update to Python 3, it has become imperative for this package to "take a stand" on the basic data structure it expects to handle.
I searched through RFC 7520 for "byte", "bytes", and "binary" and found only one instance:
All instances of binary octet strings are represented using base64url [RFC4648] encoding.
This indicates to me that the governing standard for this project does not want to deal with bytestrings directly, instead dealing with them indirectly via base64-urlencoding.
Python 2.7 makes base64-encoding bytestrings and strings easy, and both are base64-encoded to str:
Python 2.7.16 (default, Jan 27 2020, 04:46:15)
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.b64encode(b'Hello Binary World')
'SGVsbG8gQmluYXJ5IFdvcmxk'
>>> type(base64.b64encode(b'Hello Binary World'))
<type 'str'>
>>>
>>> base64.b64encode('Hello String World')
'SGVsbG8gU3RyaW5nIFdvcmxk'
>>> type(base64.b64encode('Hello String World'))
<type 'str'>
To summarize the types that are expected and returned:
bytes->strstr->str
Python 3.6 makes things more complex - users have to manually encode strings to bytes before base64-encoding them, and :
Python 3.6.4 (v3.6.4:d48ecebad5, Dec 18 2017, 21:07:28)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.b64encode(b'Hello Binary World')
b'SGVsbG8gQmluYXJ5IFdvcmxk'
>>> type(base64.b64encode(b'Hello Binary World'))
<class 'bytes'>
>>>
>>> base64.b64encode('Hello String World')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../base64.py", line 58, in b64encode
encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'
>>> type(base64.b64encode('Hello String World'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../base64.py", line 58, in b64encode
encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'
Summarizing those type expectations:
bytes->bytes- ~
str-> (exception)~
So Python 2.7 will accept bytes and str and returns a str, but Python 3 only accepts bytes and returns bytes. This impacts everybody who is using Python 3, not just users of this library. Unfortunately, it's not the behavior we would like.
Codecov Report
Merging #189 into master will not change coverage. The diff coverage is
100.00%.
@@ Coverage Diff @@
## master #189 +/- ##
=======================================
Coverage 93.51% 93.51%
=======================================
Files 16 16
Lines 1728 1728
=======================================
Hits 1616 1616
Misses 112 112
| Impacted Files | Coverage Δ | |
|---|---|---|
| jose/backends/cryptography_backend.py | 93.20% <ø> (ø) |
|
| jose/backends/ecdsa_backend.py | 98.66% <ø> (ø) |
|
| jose/backends/native.py | 97.61% <ø> (ø) |
|
| jose/backends/pycrypto_backend.py | 93.03% <ø> (ø) |
|
| jose/backends/rsa_backend.py | 95.65% <ø> (ø) |
|
| jose/jwe.py | 82.46% <100.00%> (ø) |
|
| jose/jws.py | 96.85% <100.00%> (ø) |
|
| jose/jwt.py | 100.00% <100.00%> (ø) |
|
| jose/utils.py | 100.00% <100.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 82bd8aa...d073ad2. Read the comment docs.
Hello @blag , if this issue is the remaining blocker for a new release of python-jose, can you please tell a concrete test case you would expect?
It would ease validation of this feature (or at least the impact of this change on our codebase) if you were to release a .dev0 version on pypi.org
thanks again
+1 on a .dev0 release
Any update on a .dev0 release?