Use hardened `Sha1` implementations (collision detection)
- [ ] implement collision detection in
sha-1crate - [ ] See if
sha1_smolimplements hardening (https://github.com/mitsuhiko/sha1-smol/issues/43)
At least add a feature toggle to support this crate:
- [ ] https://docs.rs/sha1collisiondetection
Some additional details:
- git uses hardened sha1 by default so there might be a chance that gitoxide messes up an object graph by generating different hashes then git does (needs investigation)
- this case is extremely rare as the probability of differing in honest usage of less than $2^{-90}$
- sha1collisiondetection is a straight port from C to rust using
c2rustwithout much modifications on top. So it is insanely unsafe (nearly 100% unsafe code). Its likely also slower then the current sha1 algorithm (implemented in assembly). - a paper describing the algrithm can be found here.
- GitHub an other hosting provides likely place a high priority on this (for example https://github.blog/2017-03-20-sha-1-collision-detection-on-github-com/)
I think implementing this in the sha-1 crate would be the best option but also extremely challenging (its partly written in assembly). Perhaps adding this ~~to the sha1_soml crate~~ to the rust version of the sha crate (so with the asm flag disabled) first would be easier (and adding some kind of safety feature). That library only contains two unsafe blocks so it might be preferable from a security point of view and also easier to contribute to (it also does not contain any assembly).
I was wondering what should happen once a collision is detected, as it's critical to do the same thing that git does in that case.
Options are…
- fail and abort the operation that caused the collision, effectively preventing the object to be added
- generate a different, safer hash instead
It looks like git is taking approach number 2, and creates a different hash instead. By the looks of it, they can control various flags related to collision detection at runtime, but initialize these to be as safe as possible (detect collision, do unavoidable attack condition checks, and generate a safe hash with additional rounds).
For our implementation to be valid, we'd have to be sure that our safe hashes are indeed the same as the ones that git produces. Unfortunately I wasn't able to find a test there.
libgit2 has a couple of tests fortunately, and the implementation git borrowed its SHA1 detection code from has some tests as well.
With that, it feels like it's really up to gitoxide on how to implement it, with the bare minimum being the detection of collision so errors can be produced. Unfortunately there is no configuration value to determine if safe hashes should be used or not, so gitoxide can turn it on by default like git seems to, with ways to turn it off as well (to produce errors instead).
Fyi I have a draft of this here open now: https://github.com/RustCrypto/hashes/pull/566 would be great to get feedback if this will work for gitoxide, and if not what needs to change
Thanks for the update, thanks so much for this amazing and exciting work! From what I can tell, gitoxide can configure collision detection merely by turning on the new feature toggle, which then enables it for sha1::Sha1. This will definitely work.
Something I wonder is if I will would be able to choose at runtime which implementation to use - I can imagine the new one to be quite a bit slower by merit of not being in ASM, so maybe I would want to have a git-configuration option for it rather than a compile-time-only option.
Something I wonder is if I will would be able to choose at runtime which implementation to use - I can imagine the new one to be quite a bit slower by merit of not being in ASM, so maybe I would want to have a git-configuration option for it rather than a compile-time-only option.
I have updated the PR to allow for both options to be compiled in, and making the switch much more explicit
The build options for git, and its various uses of sha1 implementations can be found here: https://github.com/git/git/blob/master/Makefile#L480-L534
TLDR: use a fast optimized implementation, unless specific build flags are specified. So my understanding is that they effectively do not use the collision detection code in any of the default builds, which is quite surprising to me. Not sure where else to look for confirmation on that.
In regards to how to setup a sha1 implementation that is "good enough" for gitoxide, probably the following makes sense
Combine the following three crates,
- the new
sha1-checked - the existing
[email protected] - asm fallbacks using
sha1-asmuntil they have been migrated to inline assembly intosha1
This should give roughly matching speeds with git asfaiu, assuming the same configurations.
Longer term there is likely an opportunity to optimize the sha1-checked using some assembly code (though as I wrote in the initial PR, likely not using the intrinsics).
Thanks a lot for the analysis, this makes sense to me and should be good enough.
And I hope one day gitoxide can take more responsibility for this dependency given its importance, to have something like gix-sha1 that can do it all just like Git (and with similar performance).