cpython icon indicating copy to clipboard operation
cpython copied to clipboard

gh-99108: Import SHA2-224 and SHA2-256 from HACL*

Open msprotz opened this issue 3 years ago • 15 comments
trafficstars

Issue #99108 is about replacing hashlib primitives (for the non-OpenSSL case) with verified implementations from HACL*. This is the first PR in the series, and focuses specifically on SHA2-256 and SHA2-224.

This PR imports Hacl_Streaming_SHA2 into the Python tree. This is the HACL* implementation of SHA2, which combines a core implementation of SHA2 along with a layer of buffer management that allows updating the digest with any number of bytes. This supersedes the previous implementation in the tree.

@franziskuskiefer was kind enough to benchmark the changes: in addition to being verified (thus providing significant safety and security improvements), this implementation also provides a sizeable performance boost!

---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
Sha2_256_Streaming            3163 ns      3160 ns       219353     // this PR
LibTomCrypt_Sha2_256          5057 ns      5056 ns       136234     // library used by Python currently

The changes in this PR are as follows:

  • import the subset of HACL* that covers SHA2-256/224 into Modules/_hacl
  • rewire sha256module.c to use the HACL* implementation

To review the changes, one should focus on sha256module.c, and possibly Hacl_Streaming_SHA2.h which is the API entry point. The rest of the files are part of HACL* and are cherry-picked straight from the HACL* repository, so as to minimize the amount of hand-edits and facilitate future refreshes of the code, should HACL* land some performance improvements.

The original code comes from https://github.com/project-everest/hacl-star/tree/master/dist/gcc-compatible. The various .h files in include/, krmllib/, etc. will be shared across (hopefully) future primitives imported from HACL*. I can trim those files to minimize the amount of includes, but then this will make refreshing the code in the future harder (since the trimming will have to be redone for each refresh).

The code was originally authored by @karthikbhargavan and I; @polubelova and @r1km provided a considerable amount of help over the past week, as we were overhauling our implementation to offer a significant performance boost (we want this first PR to be a good one!). Thanks to everyone involved in this team effort.

There are a couple remarks left in the source code, but for now, I think it's better to open up a PR and start discussing. As promised, tagging @alex for this PR. Thanks so much!

  • Issue: gh-99108

msprotz avatar Nov 04 '22 22:11 msprotz

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 04 '22 22:11 bedevere-bot

All commit authors signed the Contributor License Agreement.
CLA signed

cpython-cla-bot[bot] avatar Nov 04 '22 22:11 cpython-cla-bot[bot]

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 04 '22 22:11 bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 07 '22 17:11 bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 07 '22 17:11 bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 07 '22 19:11 bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 07 '22 20:11 bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 07 '22 22:11 bedevere-bot

Please include a benchmark comparing to the OpenSSL based sha2 as well.

gpshead avatar Nov 07 '22 23:11 gpshead

Please include a benchmark comparing to the OpenSSL based sha2 as well.

Adding to the numbers in the PR description https://github.com/python/cpython/pull/99109#issue-1436683253 we get the following. Note that OpenSSL is compiled without assembly support (no-asm no-hw) to get a fair comparison.

OpenSSL_Sha2_256           3000 ns         2999 ns       233849

franziskuskiefer avatar Nov 08 '22 16:11 franziskuskiefer

Note that OpenSSL is compiled without assembly support (no-asm no-hw) to get a fair comparison.

LOL, thanks, but fair enough. 😄 This does at least answer my broader question about the feasibility of also getting rid of the OpenSSL backed _hashlib in the long run: not today.

gpshead avatar Nov 08 '22 17:11 gpshead

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

bedevere-bot avatar Nov 08 '22 19:11 bedevere-bot

To keep the discussion general and not related to sha256, I responded on the issue regarding performance, ASM vs no-ASM, etc.

I'm still facing build issues, in which the _hacl directory is not being created, and the compiler can't write out the .o file. This seems to be specific to the scenario where the build output directory is outside of the source tree. The error happens early on for Linux, and later on for OSX. If that sounds familiar to anyone (@tiran?) I'd be happy to take pointers on how to fix it. Thanks.

msprotz avatar Nov 08 '22 20:11 msprotz

Checkout SRCDIRS in configure.ac.

tiran avatar Nov 08 '22 20:11 tiran

Thanks to @zooba for helping me fix the Windows build.

@gpshead to answer your earlier comment: while this PR is a "pure C" version, here are our performance numbers for our version (not submitted as part of this PR) that leverages the SHA-NI instruction, when present. This is on a Ryzen 5.

EverCrypt_Sha2_256_Streaming        776 ns          776 ns       907329 // universal streaming API for all hashes, but calling SHA-NI when available
OpenSSL_Sha2_256                    524 ns          524 ns      1334337 // with asm optimizations

We haven't worked on optimizing this specific codepath, so I'm confident that we could drive the number lower and be almost the same as OpenSSL. This is also on a relatively small sample size, so numbers would probably be almost equal for larger input sizes.

Bottom line is, there are ways to make verified code on-par with OpenSSL... this PR is just a first step.

msprotz avatar Nov 09 '22 18:11 msprotz

hi everyone, just a friendly ping on this PR; is there any way I can help the discussion make progress? happy to provide more context, explanations, or perf measurements if any of these would be helpful... thanks!

msprotz avatar Dec 01 '22 00:12 msprotz

It'd be useful for this kind of thing to include a script (somewhere under Tools?) that can generate and update the hacl-star vendored code implementation. As I assume we'll not want this to be a one time code drop but periodically update it, at least for every major Python release, as hacl-star evolves.

this comment still applies.

gpshead avatar Dec 01 '22 00:12 gpshead

this comment still applies.

I'm happy to do that, although I touched up a couple things manually, so it might take a few PRs in HACL to get the generated C code to be exactly like the one I have in this PR.

Would it be ok to add the script as part of a followup PR, or would it be more strategic (meaning, increased likelihood of a PR accepted!) to have it here now along with the initial PR?

msprotz avatar Dec 01 '22 00:12 msprotz

After a few weeks of upstream work, here is a revised PR. Highlights:

  • thanks to upstream proofs showing that various internal steps (update, update_blocks, update_last) are identical for SHA2-224 and SHA2-256, the vendored files are about 400 lines smaller now
  • almost all of the manual edits that I had performed earlier are now fixed upstream in the verified codebase, meaning that I was able to write a bash script that imports the required files automatically from upstream into the CPython repository
  • we now implement the missing state copy function
  • I fixed handling of more than 4GB of data, along with a test
  • I added a script to automatically refresh the vendored code from upstream.

Regarding that last point, the script:

  • cherry-picks the 6 files currently imported into the CPython tree
  • manually removes a series of #includes that are not necessary for the specific subset of HACL files we import here
  • removes sha2-384 and sha2-512 from the Hacl_SHA2_Streaming file (I want to keep this initial PR small and easy to review)
  • trims the default headers to remove definitions that are in general useful for the HACL* toolchain but not used by SHA2-256 specifically

The last three steps are manual cleanups, and are not strictly necessary: without them, there would be a few more unused definitions and headers. But I want to keep this initial PR neat and tidy, hence the manual elimination of definitions that are generally useful, but not used by SHA2-256 specifically.

msprotz avatar Dec 20 '22 21:12 msprotz

@gpshead Seems you are the one to see this through to the end. Could you see what's left to do?

gvanrossum avatar Jan 25 '23 18:01 gvanrossum

Okay, I think this is ready. I pushed a few changes to the PR branch:

  • Improvements to refresh.sh.
  • A README.md in Modules/_hacl/ to go along with this third party code.
  • Used C #define hacks to "namespace" the Hacl_ prefixed symbols from the .c file so that they can't cause ODR violations if a process winds up linking with other code that linked against its own version of HACL*. (we do this on some of our other third party things within CPython).

The list confined to its own .h file was manually generated and thus needs manual updating, but I doubt the list will change often. I'd like a test to confirm that no Hacl_ C symbols exist in _sha256.so or _sha256.pyd... but that can remain a future todo wishlist item for now.

gpshead avatar Jan 31 '23 07:01 gpshead

Thanks both @gpshead and @erlend-aasland!

@erlend-aasland what warning are you seeing? Is it a cast from python's size_t to uint32_t? I wasn't able to reproduce (neither on gcc-11/x86_64/Linux, nor {clang,gcc-12}/x86_64/OSX), but will happily add a cast if that's what you're seeing.

msprotz avatar Jan 31 '23 16:01 msprotz

Yeah, it's a cast warning. You'll see it in the diff tab on the PR; it's one of the GitHub windows runners that's complaining, IIRC (I'm on mobile now).

erlend-aasland avatar Jan 31 '23 16:01 erlend-aasland

I fixed the final review comments... @erlend-aasland do I need to re-request a review from you, or is there anything I can do? thanks!

msprotz avatar Feb 07 '23 00:02 msprotz

Yup! I'll respond on the issue so that we don't scatter the overall discussion across individual PRs.

msprotz avatar Feb 07 '23 04:02 msprotz