pythondotorg
Python download checksum in MD5
The Python download checksum is given in MD5.
In my understanding, MD5 security is completely broken and thus MD5 should no longer be used. See https://en.wikipedia.org/wiki/MD5#Security for example.
So I kindly ask you to update your checksum algorithm. For example you could provide a SHA-256 checksum instead.
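To illustrate what a SHA-256 check would look like on the user's side, here is a minimal sketch using only the standard library. The function name `sha256_of_file` and the chunk size are my own choices for illustration, not anything python.org provides.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large installers do not need
    to fit in memory. `chunk_size` is an arbitrary illustrative default.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned hex string would then be compared against the value published on the download page.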
Thank you for the suggestion and it is something that is worth doing. It will require changes to both the web site and our release management process and both are volunteer-led efforts. More importantly, for each release file we already provide a GPG signature file (via the SIG link after each file) which provides more robust verification than a simple checksum.
This is a great suggestion. Alternatively, at least do what nodejs does: they place the SHA256 checksums in a text file at the same location as the download, which makes checksum validation more scriptable. Example: https://nodejs.org/download/release/v16.20.0
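As a rough sketch of why a plain-text checksum file is easy to script against: the code below assumes the file uses the common `sha256sum` layout, one `<hex digest>  <filename>` pair per line. The function names are hypothetical, and fetching the file and the download is left out.

```python
import hashlib
import hmac


def parse_checksum_manifest(text):
    """Parse '<hex digest>  <filename>' lines into {filename: digest}.

    Assumes the sha256sum-style layout; blank or malformed lines are skipped.
    """
    checksums = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue
        digest, name = parts
        # sha256sum prefixes the name with '*' in binary mode; drop it.
        checksums[name.strip().lstrip("*")] = digest.lower()
    return checksums


def matches_manifest(filename, data, checksums):
    """Return True if the SHA-256 of `data` equals the listed digest."""
    expected = checksums.get(filename)
    if expected is None:
        return False  # file is not listed, so nothing to verify against
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

Note that this only detects corruption or a mismatched file. If the checksum file is served from the same place as the download, it does not by itself prove who produced the file.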
More recently, release artifacts have been signed with Sigstore.
More info: https://www.python.org/download/sigstore/
@sethmlarson Do you think it's worth switching MD5 for something like SHA-256, or are people better off using Sigstore instead? Should we drop MD5?
Checking the MD5 is easy, and as long as it's provided, there will be people using that. If the MD5 can't be relied upon, it will just give people a false sense of security. Removing it will encourage people to use other more reliable methods (which should be documented clearly).
@ezio-melotti, @hugovk What do you both think about the future of this issue? Should we close it, point to Sigstore, or do something else?
I would like to provide checksums that aren't MD5 (in addition to making the Sigstore verification steps more visible). Back-filling the correct values is in progress (cc @woodruffw), since we first need to make sure the artifacts are the correct ones (i.e., by verifying the GPG signatures).
ISTM that two things could be done:
- replace MD5 checksums with SHA-256 ones
- add a paragraph that explains briefly how to verify the downloads and/or links to a different page with more exhaustive instructions
For cross-checking/backfilling purposes: https://github.com/woodruffw/cpython-release-tracker has a dump of every hosted version of CPython's release assets, including SHA256 hashes (which are computed when generating that repo).
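A cross-check of that kind could be sketched roughly as follows. This makes no assumption about how the repository stores its data: both inputs are hypothetical `{filename: sha256 hex}` mappings that would have to be built from whichever sources are being compared.

```python
def cross_check(published, reference):
    """Compare two {filename: sha256 hex digest} mappings.

    Returns a (mismatched, missing) tuple of sorted filename lists:
    - mismatched: present in both mappings with different digests
    - missing: present in `reference` but absent from `published`
    """
    mismatched = sorted(
        name
        for name, digest in published.items()
        if name in reference and reference[name].lower() != digest.lower()
    )
    missing = sorted(name for name in reference if name not in published)
    return mismatched, missing
```

An empty `mismatched` list only says the two sources agree with each other; the artifacts themselves would still need to be validated against the GPG signatures, as mentioned above.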