gitoxide icon indicating copy to clipboard operation
gitoxide copied to clipboard

Add `ObjectId::Sha256` and `Kind::Sha256`

Open cruessler opened this issue 2 weeks ago • 1 comments

This is a draft that adds ObjectId::Sha256 and Kind::Sha256 to gix-hash. I’m opening this to get early feedback, in particular from a CI run.

Some remarks:

  • It feels wrong to put empty_blob_sha256 and empty_tree_sha256 into oid as the comment says oid is related to Sha1, but I’m unsure how to best resolve this tension.
  • There’s inconsistencies with respect to spelling: SHA256, SHA-256, Sha256.
  • ObjectId::null_sha1 hardcodes 20. Should this be SIZE_OF_SHA1_DIGEST? oid::null_sha1 uses SIZE_OF_SHA1_DIGEST.
  • Kind::len_in_hex and Kind::len_in_bytes also hardcode values.
  • Hardcoded empty trees and empty blobs appear in more than place (for both SHA-1 and SHA-256). I feel like we should have a single source of truth here.
  • Some comments in ObjectId would probably benefit from being rephrased. See e. g. ObjectId::new_sha1 or ObjectId::from_20_bytes.

Sources for the hardcoded values related to SHA-256:

https://github.com/git/git/blob/e85ae279b0d58edc2f4c3fd5ac391b51e1223985/hash.h https://github.com/git/git/blob/e85ae279b0d58edc2f4c3fd5ac391b51e1223985/hash.c

cruessler avatar Dec 10 '25 19:12 cruessler

You are on fire :)! Let me share some early thoughts, without having looked at the code at all.

  • It feels wrong to put empty_blob_sha256 and empty_tree_sha256 into oid as the comment says oid is related to Sha1, but I’m unsure how to best resolve this tension.

And indeed, &oid is supposed to represent any hash. if empty_blob somehow is hardcoded to always be SHA1, it's a bug and it needs to take a gix-hash::Kind. The empty_something could also be implemented on gix-hash::Kind while at it.

  • There’s inconsistencies with respect to spelling: SHA256, SHA-256, Sha256.

Should we fix them in favor of what Git seems to do: SHA256?

Screenshot 2025-12-11 at 04 59 53

I am definitely open to your suggestions.

  • ObjectId::null_sha1 hardcodes 20. Should this be SIZE_OF_SHA1_DIGEST? oid::null_sha1 uses SIZE_OF_SHA1_DIGEST.

Yes, that sounds like an oversight.

  • Kind::len_in_hex and Kind::len_in_bytes also hardcode values.

Yes, let's use constants where we have them. And since there is already a precedent with SIZE_OF_SHA1_DIGEST we might also have the respective SHA256 versions and use them.

  • Hardcoded empty trees and empty blobs appear in more than place (for both SHA-1 and SHA-256). I feel like we should have a single source of truth here.

I am open to suggestions as to where the empty-* methods should be, but would also not lightheartedly introduce breaking changes. If the new way feels better, we should have it. By now, I always think in terms of gix-hash::Kind::null()/empty_*() by the way, which aligns well with repo.object_hash().…()

  • Some comments in ObjectId would probably benefit from being rephrased. See e. g. ObjectId::new_sha1 or ObjectId::from_20_bytes.

Yes, it's a good time to unify this and make it less 'hardcoded'.

Thanks again 🙏

Byron avatar Dec 11 '25 04:12 Byron