git-scm.com

Remove claims of 'cryptographic integrity'

Open msgilligan opened this issue 8 years ago • 9 comments

I see the words "cryptographic integrity" and "strong cryptographic integrity" in two files:

  • app/views/about/_data_assurance.html.erb
  • app/views/blog/posts/2010-08-25-notes.markdown

This should be rewritten to say hash-based integrity or something more realistic.

msgilligan avatar Feb 23 '17 16:02 msgilligan

FWIW I had to tell a client the other day that they couldn't use Git for a document authentication scheme because of SHA1 collisions (the documents in question were worth enough to make $100k+ attacks economically feasible).

petertodd avatar Feb 23 '17 16:02 petertodd

@peff or anyone with a bigger knowledge on git + security that could offer some comment on this one?

pedrorijo91 avatar Feb 24 '17 20:02 pedrorijo91

It's not clear to me what the best path forward is. There is cryptographic integrity. But with lots of caveats related to sha1, and how collisions work, and what attacks might look like, etc. If the proposal is just s/cryptographic integrity/hash-based integrity/, I guess that does not hurt, but nor does it remotely tell the whole picture (both positives and negatives).

It's also not clear yet what mitigations are going to look like. Based on what is known now, it seems there is a very good chance that Git can simply reject these intentionally-colliding contents and maintain the security principles in practice.

peff avatar Feb 24 '17 22:02 peff

Well there certainly isn't "strong" cryptographic integrity and this is clearly wrong:

The data model that Git uses ensures the cryptographic integrity of every bit
of your project.  Every file and commit is checksummed and retrieved by its
checksum when checked back out.  It's impossible to get anything out of Git
other than the <strong>exact bits you put in</strong>

It is also impossible to change any file, date, commit message, or any other
data in a Git repository without changing the IDs of everything after it.
This means that if you have a commit ID, you can be assured not only that
your project is exactly the same as when it was committed, but
that nothing in its history was changed.

This shouldn't be too hard to rewrite. Change "strong" to "weak", maybe replace "cryptographic integrity" with "hash-based integrity" as @peff suggests. Replace "impossible" with "difficult" (I don't understand all the possible exploits, so "difficult" may be the wrong word, too), et cetera.

msgilligan avatar Feb 24 '17 23:02 msgilligan
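
For readers following the quoted blurb: the "checksum" it mentions is, for file contents, a hash over a short header plus the file bytes. Below is a minimal Python sketch, assuming the standard blob format (`blob <size>\0` followed by the content). The name `git_blob_id` is chosen for this illustration and is not part of Git.

```python
import hashlib


def git_blob_id(content: bytes, algo: str = "sha1") -> str:
    """Return the ID Git would assign to a blob holding `content`.

    Git hashes a header ("blob <size>\\0") followed by the raw bytes.
    `algo` is a hashlib algorithm name; "sha1" matches classic repositories.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.new(algo, header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID in SHA-1 repositories.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    # Changing a single byte of content changes the ID.
    print(git_blob_id(b"hello\n"))
    print(git_blob_id(b"hellp\n"))
```

Any change to the content yields a different ID, which is the property the blurb relies on. How strong that guarantee is depends on the collision resistance of the hash function, which is the point under discussion in this issue.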

I care a lot less about the blog posts. I think their future is to be destroyed with :fire:. Any useful content there made it into the second edition of Pro Git, which we mirror on the site (as an aside, anybody interested in this topic should probably be going over the contents of https://github.com/progit/progit2).

The data-assurance blurb in the "about" page is definitely worth re-visiting, though. PRs welcome.

peff avatar Feb 24 '17 23:02 peff

@petertodd Hmmm, wonder if GPG signing of commits would have affected that. With GPG signatures enforced on every commit, wouldn't that add an effective cryptographically secure er... overlay layer?

justinclift avatar Sep 26 '17 09:09 justinclift

GPG signatures still sign an insecure SHA1 commit hash, so no.

This is why the Bitcoin Core project now has code that rehashes the tree with SHA512 when doing merges: https://github.com/bitcoin/bitcoin/blob/dabee00ef1a7a2857c3318e898d3f63f79853048/contrib/devtools/github-merge.py#L248

Not a complete solution, as that SHA512 hash doesn't cover git history itself, but it's a lot better than doing nothing at all.

petertodd avatar Sep 26 '17 15:09 petertodd
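
The idea of rehashing a tree with a stronger function can be sketched as follows. This is not the Bitcoin Core script linked above. It is a simplified illustration under assumptions made here: the function name `tree_sha512`, the choice to skip `.git`, and the `digest  path` line format are all hypothetical.

```python
import hashlib
import os


def tree_sha512(root: str) -> str:
    """Return one SHA-512 digest covering every regular file under `root`.

    Files are visited in sorted order so the result is deterministic.
    Each file contributes a line "<sha512 of contents>  <relative path>".
    """
    overall = hashlib.sha512()
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so os.walk descends in a stable order; skip Git metadata.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # ignore broken symlinks and special files
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as f:
                file_digest = hashlib.sha512(f.read()).hexdigest()
            overall.update(f"{file_digest}  {rel}\n".encode("utf-8"))
    return overall.hexdigest()


if __name__ == "__main__":
    # Hash the current working directory as a quick demonstration.
    print(tree_sha512("."))
```

A signer could record this digest alongside the commit so that verification does not rest on SHA-1 alone. As noted in the comment above, a digest like this covers only the files in the tree, not the Git history.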

With Git's move toward other hash functions (https://github.com/git/git/commit/721cc4314cb593e799213ad5f926a1e9fc5779b0) and other changes or improvements introduced to Git since this issue was opened, can we reach a consensus on how to describe Git with respect to its cryptographic integrity, @peff?

pedrorijo91 avatar May 04 '18 23:05 pedrorijo91

To be honest, I'm not sure anything needs to be done here. Yes, SHA-1 has problems, and they're not likely to get better. But the best known attacks aren't viable (not for economic reasons, but because they're based on a technique that leaves traces, and by default Git is built with a SHA-1 library that will detect and reject objects with those traces). Certainly it's still worth moving off of SHA-1, but any big overhaul of the discussion of cryptographic properties may want to wait until the hash transition is further along.

That said, if somebody is really bothered by the current language, they're welcome to open a PR. Of the site-specific content, it should just be the about page, since that blog post has gone away. I suspect the Pro Git content would want some looking over, though.

peff avatar May 07 '18 06:05 peff