git-subhistory

Support Signing

Open tautropfli opened this issue 8 years ago • 4 comments

Summary of my changes

  • Creates a mapping from unsigned to (maybe) signed commits of the subproject
  • Uses the commit SHA from the subproject if available
  • Adds support for signing new commits to the subproject

tautropfli avatar Jan 10 '17 12:01 tautropfli

Oh gosh thanks so much! How did you hear about this project? Are you actually using git-subhistory for any of your projects? What's your use case?


Can you elaborate more on what this PR does, exactly? What's the new second argument to git subhistory split?

It looks something like this: the optional second argument is a signed commit history of Sub. Before the normal filter-branch that does the actual splitting, you first run a filter-branch that looks like a no-op but actually strips off the signatures. (If this is right, you should comment this; the filter-branch manpage doesn't say anything about commit signatures, though it does mention that it always strips tag signatures.) You use that to create a map from hashes of unsigned Sub commits to signed Sub commits. Then, during the filter-branch that does the actual splitting, if a Main commit is split into an (unsigned) Sub commit for which we already have a signed version, you replace it with that.

This makes sense; you don't do this but I assume the next step is to have git subhistory assimilate pass the assimilatee to split as the second argument, so if any signed commits were previously assimilated in, then they'll be split out as the correct hashes, and will be able to be correctly used as the merge base.
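The signature-stripping map described above can be sketched in a few lines. This is only a toy model, not the PR's code: `Commit`, `commit_id`, and `build_signature_map` are hypothetical names, and the hash is a stand-in for Git's real object id. It shows that stripping a signature changes a commit's hash and the hashes of all its descendants, so the map has to be built by rewriting history in parent-before-child order.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy commit record (hypothetical; real Git commits carry more fields)."""
    tree: str
    parents: tuple
    message: str
    signature: str = ""  # empty string means "unsigned"


def commit_id(c: Commit) -> str:
    # Stand-in for Git's object id: any change to the content, including
    # the signature or a parent id, yields a different hash.
    payload = "\n".join([c.tree, *c.parents, c.signature, c.message])
    return hashlib.sha1(payload.encode()).hexdigest()


def build_signature_map(history):
    """Return {unsigned id: original id} for a (maybe) signed Sub history.

    `history` must list parents before children, and each commit's
    `parents` must hold ids from the original history.
    """
    rewritten = {}  # original id -> id of the signature-free rewrite
    for c in history:
        unsigned = Commit(
            tree=c.tree,
            parents=tuple(rewritten[p] for p in c.parents),
            message=c.message,
        )
        rewritten[commit_id(c)] = commit_id(unsigned)
    return {unsigned: original for original, unsigned in rewritten.items()}


# Usage: a signed root followed by an unsigned child.
root = Commit("tree-1", (), "init", signature="sig-by-upstream")
child = Commit("tree-2", (commit_id(root),), "more work")
signature_map = build_signature_map([root, child])
```

Note that the unsigned child also gets a new id after stripping, because its parent's id changed; that is why both commits end up in the map.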

This doesn't make for a great split workflow though, right? From my understanding, the way this would have to work is:

  1. Sub upstream has signed commits
  2. We assimilate those into Main, but we have to keep around a branch with those signed commits
  3. We make changes to Sub that we want to send upstream (e.g. by pushing to a fork and submitting a PR)
  4. We split those changes out, passing that branch we kept around from Step 2 as the second argument

I guess it's not so bad assuming there's a remote-tracking branch (e.g. sub-upstream/master), but still.

One thing I've been thinking about is #8, which is the idea of permanently keeping around the Sub commits that correspond to assimilated commits, not only as an optimization but also to enable squash-merges; that could take care of tracking corresponding signed Sub commits too, no second argument to split necessary. Thoughts?


Seemingly unrelated, another thing it looks like this PR does is if -S/--sign is provided, split and assimilate both sign every commit that's by the current user. I dunno about this, it means different people splitting the same Main commits can produce different (-ly signed) Sub commits, right?

One idea that seems reasonable to me is, if a Main commit is by the current user and is already signed, then automatically sign it. If I end up with someone else's signed organic Main commit modifying Sub without a corresponding signed Sub commit, fail, suggesting the original signer do the split so they can sign their Sub commits. An option is provided to override that and have the current user personally sign the Sub commits split from the Main commits signed by others, or all Main commits signed or unsigned, with a strongly worded message advising the user to make these new mappings available to everyone else who works on Main; this is suitable for a centralized workflow like GitHub. This failure + overriding-with-advice encourages people to split out their own commits, or failing that, encourages a central source of truth for the commit maps.
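As a rough sketch, the policy proposed above boils down to a small decision function. All names here (`Action`, `decide`, the `override` values) are hypothetical, and the handling of unsigned Main commits when no override is given is an assumption, since the proposal does not spell it out.

```python
from enum import Enum


class Action(Enum):
    REUSE_EXISTING = "reuse the signed Sub commit we already have"
    SIGN = "sign the new Sub commit as the current user"
    LEAVE_UNSIGNED = "create an unsigned Sub commit"
    FAIL = "stop and suggest the original signer do the split"


def decide(main_signed, main_author, current_user, has_signed_sub,
           override="none"):
    """Pick what to do for one Main commit that touches Sub.

    override: "none", "signed" (also sign Sub commits split from Main
    commits signed by others), or "all" (sign everything, signed or not).
    """
    if has_signed_sub:
        return Action.REUSE_EXISTING
    if override == "all":
        return Action.SIGN
    if not main_signed:
        # Assumption: unsigned Main commits stay unsigned by default.
        return Action.LEAVE_UNSIGNED
    if main_author == current_user or override == "signed":
        return Action.SIGN
    return Action.FAIL


# Usage: Bob splitting Alice's signed commit fails unless he overrides.
print(decide(True, "alice", "bob", has_signed_sub=False))
print(decide(True, "alice", "bob", has_signed_sub=False, override="signed"))
```

The strongly worded advice about sharing the new mappings would accompany either override value.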

This wouldn't be a problem at all if we could, at commit creation time, automatically sign subtree commits (while we're at it, transform the commit message then, e.g. removing the prefix, that way it can be an arbitrary transformation that changes commit to commit). We can't use hooks to enforce that though, since there will be people who make changes in path/to/sub/ who never interact with subhistory directly and therefore we never have a chance to install hooks.

laughinghan avatar Jan 20 '17 08:01 laughinghan

On further thought I think you're onto something with the idea of temporarily stripping signatures. In particular, maybe we can solve the problem of Main commits being split into differently signed commits by stripping signatures for purposes of calculating the merge base.

For example:

  1. Alice pushes signed Main commit affecting path/to/sub/.
  2. Bob splits Alice's commits out as Sub commits and uses option to personally sign them all.
  3. Bob submits to upstream a PR with the split-out Sub commits derived from Alice's commits.
  4. PR is merged, then further improvements to Sub are added to upstream on top of the PR.
  5. Charlotte pulls down upstream improvements to Sub and wishes to merge them into Main master.

The problem now is, git-subhistory on Charlotte's machine can't recreate the Sub commits from Bob's PR, because they were signed by Bob. But we can strip the signatures from the upstream Sub commits, and split out unsigned Sub commits from Main on Charlotte's machine (while saving a map between them), and calculate the merge bases between those, then map these unsigned merge bases back to Main commits; those Main commits are the ones on top of which we shall assimilate the upstream improvements to Sub, so that they'll be the merge bases between these assimilated improvements and Main master.
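To make the Charlotte scenario concrete, here is a toy sketch of computing merge bases on signature-free histories and mapping them back to Main. The graph representation and the names `ancestors`, `merge_bases`, and `main_merge_bases` are assumptions for illustration only; a real implementation would presumably lean on git merge-base rather than reimplement it.

```python
def ancestors(parents, tip):
    """All commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b not reachable from another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c in ancestors(parents, d) - {d} for d in common)
    }


def main_merge_bases(unsigned_parents, upstream_tip, split_tip, sub_to_main):
    """Map merge bases of the unsigned Sub histories back to Main commits."""
    bases = merge_bases(unsigned_parents, upstream_tip, split_tip)
    return {sub_to_main[c] for c in bases}


# Unsigned Sub ids: s1 <- s2 come from Alice's Main commits m1, m2;
# upstream added u3 on top of the commits from Bob's PR.
unsigned_parents = {"s1": [], "s2": ["s1"], "u3": ["s2"]}
sub_to_main = {"s1": "m1", "s2": "m2"}
bases = main_merge_bases(unsigned_parents, "u3", "s2", sub_to_main)
```

In this example the only merge base is the Main commit that s2 was split from, so that is where the upstream improvements would be assimilated.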

laughinghan avatar Jan 20 '17 09:01 laughinghan

And by "onto something" I mean I've realized that this is exactly the right thing to do in the "strong correspondence" case (which I'm sketching out in #8).

Can you add some tests and docs for the second argument to split?

laughinghan avatar Jan 20 '17 19:01 laughinghan

Oh wow, didn't expect such a long comment on my PR 😆

I am using git-subhistory here (https://github.com/timetabio/) to publish a subset of the subprojects belonging to a private repository, namely "application" and "styles".

Submodules would have been just wrong for that use, and subtrees were somewhat cumbersome, so I ended up using subhistory.


I tried to add more detailed comments in https://github.com/laughinghan/git-subhistory/pull/7/commits/1684adb4ac8e45a446c5ac38a05d7bfc2f90c8cd.

Here's my take on explaining the changes in this PR: We create a mapping between synthetic commits, which are never signed, and commits from the subproject's remote (the second argument to git-subhistory split).

In the filter-branch where we extract commits touching the subproject, we create an unsigned synthetic commit and check whether we have a matching commit in the subproject. If we do, we use that commit instead of the synthetic one. Otherwise we create a new commit (signed, if the user passed -S).

This serves two purposes:

  • prevent signing of commits that already exist in the subproject, which would result in non-fast-forward changes.
  • prevent the creation of unsigned commits for commits that are signed in the subproject, again resulting in non-fast-forward changes.
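In rough terms, the lookup during the split could look like the following. This is a hedged sketch rather than the PR's actual shell code: `resolve_split_commit`, `signature_map`, and `sign` are hypothetical names, and commit ids are plain strings here.

```python
def resolve_split_commit(synthetic_id, signature_map, sign_new, sign):
    """Choose the Sub commit to emit for one synthetic (unsigned) commit.

    signature_map: {synthetic id: id of the matching subproject commit}
    sign_new:      True when the user asked for signing (e.g. passed -S)
    sign:          callable that signs a commit and returns the new id
    """
    existing = signature_map.get(synthetic_id)
    if existing is not None:
        # The subproject already has this commit, signed or not: reuse it
        # so the result stays a fast-forward of the remote branch.
        return existing
    return sign(synthetic_id) if sign_new else synthetic_id


# Usage with made-up ids: "aaa" is signed upstream, "bbb" exists unsigned.
signature_map = {"aaa": "signed-aaa", "bbb": "bbb"}
sign = lambda commit: "signed-by-me-" + commit
for synthetic in ("aaa", "bbb", "ccc"):
    print(synthetic, "->", resolve_split_commit(synthetic, signature_map, True, sign))
```

Because commits that are unsigned in the subproject map to themselves, they are reused as-is and never replaced with newly signed versions.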

I agree that this makes the workflow somewhat cumbersome, but when I'm splitting, the next step is probably to push to the remote branch anyway?

--

Additionally, I added the option to sign new commits with -S or subhistory.gpgsign = true.

it means different people splitting the same Main commits can produce different (-ly signed) Sub commits, right?

This should not happen as the mapping also ensures that unsigned commits in the subproject aren't overridden with signed commits.

tautropfli avatar Jan 22 '17 11:01 tautropfli

I'm no longer interested in continuing this PR. I'm going ahead and closing it :)

tautropfli avatar Jan 01 '24 23:01 tautropfli