git-filter-repo
Is there a way to remove duplicate commits?
I used the BFG Repo cleaner to remove large files but forgot to clone fresh copies after pushing. Now my main trunk is full of duplicate commits. Is there a git-filter-repo command that can remove them?
So, I'm guessing that you did a git pull, which merged the two different versions of history.
You'll want to find three different commits using git log:
- The merge commit that combined all the old history with the rewritten history: we'll call this ${MERGE_COMMIT}
- The first commit after ${MERGE_COMMIT}: we'll call this ${FIRST_NEW_COMMIT}
- The final commit of the BFG rewritten history (this should be one of the parents of ${MERGE_COMMIT}): we'll call this ${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY}
Each of these three should be the SHA-1 hash of the relevant commit. With these...
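One possible way to locate these three commits is sketched below. The helper names (list_merges, parents_of, first_commit_after) are hypothetical and only wrap standard git log, git rev-parse and git rev-list calls; the sketch assumes the commits after the merge sit on the branch you currently have checked out.

```shell
# Hypothetical helper: list merge commits reachable from a ref (default HEAD)
# as "<hash> <parent hashes> <subject>". The merge created by the pull should
# have one parent in the old history and one in the BFG-rewritten history.
list_merges() {
    git log --merges --format='%H %P %s' "${1:-HEAD}"
}

# Hypothetical helper: print the parents of a commit, one hash per line.
# For ${MERGE_COMMIT}, one of them is ${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY}.
parents_of() {
    git rev-parse "$1^@"
}

# Hypothetical helper: print the oldest commit that descends from $1 and is
# reachable from $2 (default HEAD), a candidate for ${FIRST_NEW_COMMIT}.
first_commit_after() {
    git rev-list --reverse --ancestry-path "$1..${2:-HEAD}" | head -n 1
}
```

For example, run list_merges to pick out ${MERGE_COMMIT}, then parents_of "${MERGE_COMMIT}" and first_commit_after "${MERGE_COMMIT}" to get candidates for the other two. It is worth checking each candidate with git show before relying on it.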
Solution 1
If all the commits in your history since that merge are not merge commits, then you could try rebasing your commits on top of the good history. Something like:
git rebase --onto ${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY} ${MERGE_COMMIT}
If you have any merge commits in your history since ${MERGE_COMMIT}, though, this would just mess things up.
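Before trying the rebase, it may be worth checking that precondition. The following is a minimal sketch; history_is_linear_since is a hypothetical helper name, and it only inspects commits reachable from the given tip (HEAD by default).

```shell
# Hypothetical helper: succeed only if there are no merge commits in the
# range <commit>..<tip>, i.e. everything after <commit> is plain linear history.
history_is_linear_since() {
    [ -z "$(git rev-list --merges "$1..${2:-HEAD}")" ]
}

# Possible usage, with the placeholders from this thread:
#
#   if history_is_linear_since "${MERGE_COMMIT}"; then
#       git rebase --onto "${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY}" "${MERGE_COMMIT}"
#   else
#       echo "merge commits found after ${MERGE_COMMIT}; rebase is not a good fit"
#   fi
```

Creating a backup branch first (for example with git branch backup-before-rebase, a name chosen here only for illustration) makes it easy to return to the current state if the result is not what you expected.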
Solution 2
Create a replace object that is a new commit like ${FIRST_NEW_COMMIT} but which has ${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY} as its parent instead of having ${MERGE_COMMIT} as its parent. Then use filter-repo to rewrite the history:
git replace --graft ${FIRST_NEW_COMMIT} ${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY}
git filter-repo --proceed
A word of caution: if you have multiple commits that have ${MERGE_COMMIT} as a parent, you'll need to create new graft commits for all of them. With N such commits, you'll need to run git replace --graft ... N times. You only need to run git filter-repo --proceed once, but it must come after all N git replace --graft ... calls.
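The N-children case could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe for your repository: children_of and regraft_children are hypothetical helper names, the search covers every ref via git rev-list --all, and a child that is itself a merge keeps its other parents.

```shell
# Hypothetical helper: print every commit (on any ref) that lists $1 as a parent.
children_of() {
    target=$(git rev-parse --verify "$1^{commit}") || return 1
    git rev-list --parents --all |
        awk -v t="$target" '{ for (i = 2; i <= NF; i++) if ($i == t) { print $1; break } }'
}

# Hypothetical helper: for each child of $1, create a graft whose parent list
# is the same as before except that $1 is swapped for $2.
regraft_children() {
    old=$(git rev-parse --verify "$1^{commit}") || return 1
    new=$(git rev-parse --verify "$2^{commit}") || return 1
    for child in $(children_of "$old"); do
        parents=$(git rev-parse "$child^@" | sed "s/^$old\$/$new/")
        # Word splitting of $parents is intentional: one argument per parent.
        git replace --graft "$child" $parents || return 1
    done
}

# Possible usage, with the placeholders from this thread:
#
#   regraft_children "${MERGE_COMMIT}" "${FINAL_COMMIT_OF_BFG_REWRITTEN_HISTORY}"
#   git filter-repo --proceed
```

After the grafts are in place, git log should already show the history without the merge, which gives you a chance to inspect the result before the single git filter-repo --proceed run makes it permanent.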
Solution 3
This one I can't give you any pseudo-code for. If you can do a filtering operation that will again modify the old commits to match the new commits, but which simultaneously is a no-op on the new commits, and which will remove the now-degenerate merge commit, that would also solve this problem. I don't remember details in terms of what additional modifications BFG makes (like [formerly OLDHASH] and Former-commit-id: and ...
Summary
Anyway, between the three, I suspect solution #2 is the most robust and easiest. Does that help?