Produce a pure-Python verification API
Description
Some installers that may eventually want to perform signature verification have a hard requirement that all of their dependencies be pure Python (pip is the predominant example, because it vendors all of its dependencies into a single pure-Python wheel).
Because sigstore-python has sub-dependencies that ship non-pure-Python wheels, it isn't immediately usable from these installers. However, installers will only use a subset of our overall API (presumably just verification), and may not need the dependencies that pull in native code.
Given that, we should:
- identify how, where and why native code is used as a sub-dependency of this project
- identify whether any of those are dependencies of verification specifically
- split our verification logic out into a separate, pure-Python library, with our existing verification API
- take a dependency on that library to provide the same API here
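As a very rough sketch of what that split could look like (all package and symbol names below are hypothetical, not a settled design), sigstore-python would become a thin re-export shim over the new pure-Python library:

```python
# sigstore/verify.py -- hypothetical sketch of the post-split layout.
# The actual verification logic lives in a separate, pure-Python
# distribution ("sigstore_verification" is a made-up name here);
# sigstore-python just re-exports its public API unchanged.
from sigstore_verification import (  # hypothetical pure-Python dep
    VerificationResult,
    Verifier,
)

__all__ = ["VerificationResult", "Verifier"]
```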
At a high level, the current sub-dependencies that either ship non-pure-Python wheels themselves or pull in sub-dependencies that do are the following:
- `cffi==1.15.1` (impure)
  - via `cryptography==41.0.3` (impure)
    - via `pyopenssl==23.2.0` (pure)
- `cryptography==41.0.3` (impure)
- `charset-normalizer==3.2.0` (impure)
  - via `requests==2.31.0` (pure)
- `multidict==6.0.4` (impure)
  - via `grpclib==0.4.5` (pure)
    - via `betterproto==2.0.0b5` (pure)
      - via `sigstore-protobuf-specs==0.1.0` (pure)
- `pydantic==1.10.12` (impure) (this will be resolved in our 2.0 release when we upgrade to `pydantic >= 2, < 3`)
  - via `id==1.1.0` (pure)
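For reference, one quick way to reproduce this kind of audit against a local environment is to look for compiled extension modules in each installed distribution. A minimal sketch (it only sees the wheels installed in the current environment, not the full matrix published on PyPI):

```python
# Sketch: flag installed distributions that ship compiled extension
# modules (.so/.pyd) -- a rough proxy for "impure" in the list above.
from importlib.metadata import distributions

for dist in distributions():
    if any(f.suffix in {".so", ".pyd"} for f in (dist.files or [])):
        print(f"{dist.metadata['Name']}=={dist.version} (impure)")
```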
> `charset-normalizer==3.2.0` (impure)

charset-normalizer has a universal wheel too

> `multidict==6.0.4` (impure)

Multidict claims that the library has optional C Extensions for speed. There's no universal wheel, though, so this will need a closer look.

Interesting, I wonder why they ship impure wheels as well.
To address `cryptography` and friends, the elephants in the room 🙂:

- X.509 certificate parsing is currently done via `cryptography`, which implements it in pure Rust (subsequent chain building is done via `pyOpenSSL`, which uses C to call into an OpenSSL or OpenSSL-like backend)
- Signature verification (SET, SCT, certificate) is similar (calls into C via `cffi` in `cryptography`)
- Small associated bits are also written in Rust internally (SCT parsing)
- Transitively, we also depend on things like PEM parsing (since we accept certificates/chains in PEM format)
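To make those touchpoints concrete, the verification-side calls are roughly of this shape (a simplified sketch, not our actual code; it assumes an EC-keyed certificate):

```python
# Sketch: the kind of cryptography calls verification bottoms out in.
# PEM/X.509 parsing and the signature check all run inside
# cryptography's native (Rust/C) internals.
from cryptography import x509
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

def verify_artifact(cert_pem: bytes, signature: bytes, payload: bytes) -> bool:
    cert = x509.load_pem_x509_certificate(cert_pem)  # PEM + X.509 parsing (Rust)
    try:
        # ECDSA verification (native code); raises InvalidSignature on mismatch
        cert.public_key().verify(signature, payload, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False
```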
On that front, there's currently an effort (which I'm working on with others at ToB) to support X.509 path building in cryptography with a pure-Rust implementation (https://github.com/pyca/cryptography/pull/9405, https://github.com/pyca/cryptography/pull/8873), meaning that a future version of sigstore-python hopefully won't need `pyOpenSSL` at all, which will also remove the `cffi` dep. However, that just exchanges one native dep (C) for another (Rust), so it's potentially not immediately useful here, beyond reducing the overall number of native deps 🙂
TL;DR: When path validation is merged, it should be possible to eliminate `pyOpenSSL` and `cffi` as dependencies, although `cryptography` will continue to be an impure dep (and we will further rely on its native bits).
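For contrast, here is roughly the pyOpenSSL-shaped step that path building in cryptography would subsume (a sketch, not our actual code):

```python
# Sketch: chain building/verification via pyOpenSSL, which calls into
# an OpenSSL backend through cffi. This is the piece a pure-Rust
# path-validation API in cryptography would replace.
from OpenSSL import crypto

def verify_chain(leaf_pem: bytes, intermediates_pem: list[bytes], root_pem: bytes) -> None:
    store = crypto.X509Store()
    store.add_cert(crypto.load_certificate(crypto.FILETYPE_PEM, root_pem))
    intermediates = [
        crypto.load_certificate(crypto.FILETYPE_PEM, pem) for pem in intermediates_pem
    ]
    leaf = crypto.load_certificate(crypto.FILETYPE_PEM, leaf_pem)
    # Raises X509StoreContextError if no valid chain can be built
    crypto.X509StoreContext(store, leaf, chain=intermediates).verify_certificate()
```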
Removing `cryptography` outright is a bigger challenge, and I can see two (non-exhaustive) possibilities:
- "The hard way": reimplement the parts we care about (PEM parsing, DER parsing, X.509, path validation) in pure Python. This will be a significant effort, and (I believe) the
cryptographymaintainers probably won't want to upstream it (since they're all in on Rust for both performance and reliability reasons). - "The cheating way": convince CPython (and other major Python distributions?) to bundle
cryptography, either as a public API (probably a hard sell) or an implementation detail.pipcould then depend oncryptography's native bits without actually vendoring it. This entails more or less binding CPython's supported architectures list to Rust's, which may or may not be a "pro" in the eyes of the maintainers 🙂
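To calibrate "the hard way": the PEM layer is genuinely trivial in pure Python (sketch below), but it's the only easy piece; DER, X.509 semantics, and path validation are where the real effort would go.

```python
# Sketch: PEM decoding in pure Python -- the easy sliver of "the hard
# way". The decoded DER bytes still need a full X.509 parser and path
# validator, which is where the significant effort lives.
import base64

def pem_to_der(pem: str, label: str = "CERTIFICATE") -> bytes:
    begin = f"-----BEGIN {label}-----"
    end = f"-----END {label}-----"
    body = pem.split(begin, 1)[1].split(end, 1)[0]
    return base64.b64decode("".join(body.split()))
```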
Documenting the native code requirements is a very good idea, but for the end goal we'll also want to look at the dependency tree as a whole: if the subset of the dependency tree that isn't already part of (e.g.) pip's own dependency tree is too large, then the pip maintainers might not be enthusiastic about vendoring attempts.
The point I'm making is that putting a lot of effort into fixing the native-code situation isn't useful if the end result is still unacceptable for vendoring because of the sheer size of the dependency tree...
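A quick way to eyeball that subtree size is to walk the requirement metadata of the installed package. A rough sketch (it skips extras and marker-gated requirements, so treat the output as an approximation):

```python
# Sketch: approximate the dependency closure of a package from
# installed metadata. Marker-gated requirements are skipped, so this
# undercounts platform-conditional deps.
import re
from importlib.metadata import PackageNotFoundError, requires

def dep_closure(name: str, seen: set[str] | None = None) -> set[str]:
    seen = set() if seen is None else seen
    if name.lower() in seen:
        return seen
    seen.add(name.lower())
    try:
        reqs = requires(name) or []
    except PackageNotFoundError:
        return seen
    for req in reqs:
        if ";" in req:  # conditional requirement (extra/platform marker)
            continue
        dep = re.split(r"[\s<>=!~\[(]", req, maxsplit=1)[0]
        dep_closure(dep, seen)
    return seen

print(len(dep_closure("sigstore")), "packages in the closure")
```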
> `multidict==6.0.4` (impure)
>
> Multidict claims that the library has optional C Extensions for speed. There's no universal wheel, though, so this will need a closer look.

This looks like a build system issue: the universal wheel is supported, but the CD builder just doesn't build it.
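For anyone following along, an easy way to check what's actually published is PyPI's public JSON API (a sketch):

```python
# Sketch: list the wheel filenames multidict 6.0.4 publishes, to check
# whether a universal (py3-none-any) wheel is among them.
import json
import urllib.request

URL = "https://pypi.org/pypi/multidict/6.0.4/json"
with urllib.request.urlopen(URL) as resp:
    release = json.load(resp)

for file in release["urls"]:
    if file["packagetype"] == "bdist_wheel":
        print(file["filename"])  # platform tags are embedded in the name
```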