cpython
cpython copied to clipboard
gh-99108: Import SHA2-224 and SHA2-256 from HACL*
Issue #99108 is about replacing hashlib primitives (for the non-OpenSSL case) with verified implementations from HACL*. This is the first PR in the series, and focuses specifically on SHA2-256 and SHA2-224.
This PR imports Hacl_Streaming_SHA2 into the Python tree. This is the HACL* implementation of SHA2, which combines a core implementation of SHA2 along with a layer of buffer management that allows updating the digest with any number of bytes. This supersedes the previous implementation in the tree.
@franziskuskiefer was kind enough to benchmark the changes: in addition to being verified (thus providing significant safety and security improvements), this implementation also provides a sizeable performance boost!
---------------------------------------------------------------
Benchmark Time CPU Iterations
---------------------------------------------------------------
Sha2_256_Streaming 3163 ns 3160 ns 219353 // this PR
LibTomCrypt_Sha2_256 5057 ns 5056 ns 136234 // library used by Python currently
The changes in this PR are as follows:
- import the subset of HACL* that covers SHA2-256/224 into
Modules/_hacl - rewire sha256module.c to use the HACL* implementation
To review the changes, one should focus on sha256module.c, and possibly Hacl_Streaming_SHA2.h which is the API entry point. The rest of the files are part of HACL* and are cherry-picked straight from the HACL* repository, so as to minimize the amount of hand-edits and facilitate future refreshes of the code, should HACL* land some performance improvements.
The original code comes from https://github.com/project-everest/hacl-star/tree/master/dist/gcc-compatible. The various .h files in include/, krmllib/, etc. will be shared across (hopefully) future primitives imported from HACL*. I can trim those files to minimize the amount of includes, but then this will make refreshing the code in the future harder (since the trimming will have to be redone for each refresh).
The code was originally authored by @karthikbhargavan and I; @polubelova and @r1km provided a considerable amount of help over the past week, as we were overhauling our implementation to offer a significant performance boost (we want this first PR to be a good one!). Thanks to everyone involved in this team effort.
There are a couple remarks left in the source code, but for now, I think it's better to open up a PR and start discussing. As promised, tagging @alex for this PR. Thanks so much!
- Issue: gh-99108
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
Please include a benchmark comparing to the OpenSSL based sha2 as well.
Please include a benchmark comparing to the OpenSSL based sha2 as well.
Adding to the numbers in the PR description https://github.com/python/cpython/pull/99109#issue-1436683253 we get the following. Note that OpenSSL is compiled without assembly support (no-asm no-hw) to get a fair comparison.
OpenSSL_Sha2_256 3000 ns 2999 ns 233849
Note that OpenSSL is compiled without assembly support (
no-asm no-hw) to get a fair comparison.
LOL, thanks, but fair enough. 😄 This does at least answer my broader question about the feasibility of also getting rid of the OpenSSL backed _hashlib in the long run: not today.
Most changes to Python require a NEWS entry.
Please add it using the blurb_it web app or the blurb command-line tool.
To keep the discussion general and not related to sha256, I responded on the issue regarding performance, ASM vs no-ASM, etc.
I'm still facing build issues, in which the _hacl directory is not being created, and the compiler can't write out the .o file. This seems to be specific to the scenario where the build output directory is outside of the source tree. The error happens early on for Linux, and later on for OSX. If that sounds familiar to anyone (@tiran?) I'd be happy to take pointers on how to fix it. Thanks.
Checkout SRCDIRS in configure.ac.
Thanks to @zooba for helping me fix the Windows build.
@gpshead to answer your earlier comment: while this PR is a "pure C" version, here are our performance numbers for our version (not submitted as part of this PR) that leverages the SHA-NI instruction, when present. This is on a Ryzen 5.
EverCrypt_Sha2_256_Streaming 776 ns 776 ns 907329 // universal streaming API for all hashes, but calling SHA-NI when available
OpenSSL_Sha2_256 524 ns 524 ns 1334337 // with asm optimizations
We haven't worked on optimizing this specific codepath, so I'm confident that we could drive the number lower and be almost the same as OpenSSL. This is also on a relatively small sample size, so numbers would probably be almost equal for larger input sizes.
Bottom line is, there are ways to make verified code on-par with OpenSSL... this PR is just a first step.
hi everyone, just a friendly ping on this PR; is there any way I can help the discussion make progress? happy to provide more context, explanations, or perf measurements if any of these would be helpful... thanks!
It'd be useful for this kind of thing to include a script (somewhere under Tools?) that can generate and update the hacl-star vendored code implementation. As I assume we'll not want this to be a one time code drop but periodically update it, at least for every major Python release, as hacl-star evolves.
this comment still applies.
this comment still applies.
I'm happy to do that, although I touched up a couple things manually, so it might take a few PRs in HACL to get the generated C code to be exactly like the one I have in this PR.
Would it be ok to add the script as part of a followup PR, or would it be more strategic (meaning, increased likelihood of a PR accepted!) to have it here now along with the initial PR?
After a few weeks of upstream work, here is a revised PR. Highlights:
- thanks to upstream proofs showing that various internal steps (update, update_blocks, update_last) are identical for SHA2-224 and SHA2-256, the vendored files are about 400 lines smaller now
- almost all of the manual edits that I had performed earlier are now fixed upstream in the verified codebase, meaning that I was able to write a bash script that imports the required files automatically from upstream into the CPython repository
- we now implement the missing state copy function
- I fixed handling of more than 4GB of data, along with a test
- I added a script to automatically refresh the vendored code from upstream.
Regarding that last point, the script:
- cherry-picks the 6 files currently imported into the CPython tree
- manually removes a series of #includes that are not necessary for the specific subset of HACL files we import here
- removes sha2-384 and sha2-512 from the
Hacl_SHA2_Streamingfile (I want to keep this initial PR small and easy to review) - trims the default headers to remove definitions that are in general useful for the HACL* toolchain but not used by SHA2-256 specifically
The last three steps are manual cleanups, and are not strictly necessary: without them, there would be a few more unused definitions and headers. But I want to keep this initial PR neat and tidy, hence the manual elimination of definitions that are generally useful, but not used by SHA2-256 specifically.
@gpshead Seems you are the one to see this through to the end. Could you see what's left to do?
Okay, I think this is ready. I pushed a few changes to the PR branch:
- Improvements to refresh.sh.
- A README.md in Modules/_hacl/ to go along with this third party code.
- Used C
#definehacks to "namespace" theHacl_prefixed symbols from the .c file so that they can't cause ODR violations if a process winds up linking with other code that linked against its own version of HACL*. (we do this on some of our other third party things within CPython).
The list confined to its own .h file was manually generated and thus needs manual updating, but I doubt the list will change often. I'd like a test to confirm that no Hacl_ C symbols exist in _sha256.so or _sha256.pyd... but that can remain a future todo wishlist item for now.
Thanks both @gpshead and @erlend-aasland!
@erlend-aasland what warning are you seeing? Is it a cast from python's size_t to uint32_t? I wasn't able to reproduce (neither on gcc-11/x86_64/Linux, nor {clang,gcc-12}/x86_64/OSX), but will happily add a cast if that's what you're seeing.
Yeah, it's a cast warning. You'll see it in the diff tab on the PR; it's one of the GitHub windows runners that's complaining, IIRC (I'm on mobile now).
I fixed the final review comments... @erlend-aasland do I need to re-request a review from you, or is there anything I can do? thanks!
Yup! I'll respond on the issue so that we don't scatter the overall discussion across individual PRs.