bfg-repo-cleaner
How to get rid of the "Former-commit-id"
I noticed that after I ran bfg and scrubbed out sensitive data, the rewritten commit messages started to include lines like:
Former-commit-id: abcd123
And after clicking on the link to the former commit ID, I found the sensitive data was still there!
Example: https://github.com/esend7881/udacity-android-nanodegree--july2015-project1/commit/c46a009a990268419020cdba8aa00869a27f4c56 There you can click on the former id and see the old code.
How can I delete all the overwritten commits entirely?
Can you tell me what command line params you passed to the BFG? Delete files, strip blobs, replace text, etc?
The BFG is only supposed to add Former-commit-id: commit footers when the operation is 'public', rather than 'private', data removal. By default, 'public' means removing 'big' files, i.e. removing files by size, but when you delete files by name, that's private by default. You can also pass the --private flag to force the BFG to treat the operation as private. So I'd be interested to know whether you used --delete-files or not.
To answer your first question, I basically followed the steps on your home page https://rtyley.github.io/bfg-repo-cleaner/
Since the terminal is not with me now, this is what I remember:
$ git clone --mirror git://example.com/some-big-repo.git
$ java -jar bfg.jar --strip-biggest-blobs 1 some-big-repo.git
# - Note -- using "1" because the text I am scrubbing is small (and the repo is small too)
$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ cd ..
$ bfg --replace-text passwords.txt some-big-repo.git
# - Where `passwords.txt` contains the exposed password.
$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ git push
This appeared to work (the output text reported success), but it added the Former-commit-id footers.
Could I redo the above steps on the repo as it is now, but pass in the --private flag? Would doing that get rid of all occurrences of the password, including in the commits the former ids point to?
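Before re-running anything, it can help to see which old commits the rewritten history still points at. The following is a minimal sketch, not part of the BFG: the helper name `former_commit_ids` and the sample ids are made up for illustration, and it assumes the footers carry full 40-character hex ids.

```python
import re
from typing import List

# Matches the footer line the BFG appends to rewritten commit messages.
# Assumption: the id is a full 40-hex-digit SHA-1, one footer per line.
FOOTER = re.compile(r"^Former-commit-id: ([0-9a-f]{40})\s*$", re.MULTILINE)


def former_commit_ids(log_text: str) -> List[str]:
    """Return distinct former commit ids in order of first appearance."""
    seen = {}
    for match in FOOTER.finditer(log_text):
        seen.setdefault(match.group(1), None)
    return list(seen)


# Made-up commit messages standing in for real log output.
sample_log = (
    "Scrub password\n\n"
    "Former-commit-id: " + "a" * 40 + "\n\n"
    "Initial commit\n\n"
    "Former-commit-id: " + "b" * 40 + "\n"
)
for commit_id in former_commit_ids(sample_log):
    print(commit_id)
```

In practice the text could come from something like `git log --all --format=%B` run inside the rewritten repository. Each id printed is an old commit that is still linked from a rewritten commit message.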
Any updates on this?
Related: #139, #293, #140.
This is an old question but here's what I used to fix that:
$ git filter-branch --msg-filter 'sed -E "s/Former-commit-id: [0-9a-f]{40}//g"' --tag-name-filter cat -- --all
It takes quite a long time to run on a large repo, however.
Is there a git filter-repo equivalent?
git filter-repo --message-callback 'return re.sub(b"Former-commit-id: [0-9a-f]{40}.*", b"", message, flags=re.MULTILINE)'
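To preview what that callback does to a message before rewriting a repository, the same substitution can be tried on its own. This is only a sketch: the function name `strip_footer` and the sample message are made up, and the trailing-whitespace tidy-up is an addition that the one-liner above does not perform.

```python
import re

# Same pattern as the callback above. It is a bytes pattern because the
# callback above works on bytes (note its b"" literals).
FOOTER = re.compile(rb"Former-commit-id: [0-9a-f]{40}.*", re.MULTILINE)


def strip_footer(message: bytes) -> bytes:
    """Drop BFG footers, then trim the blank lines they leave behind.

    The trimming step is an assumption about the desired result; the
    original callback only performs the substitution.
    """
    cleaned = FOOTER.sub(b"", message)
    return cleaned.rstrip() + b"\n"


# Made-up commit message with a made-up 40-character id.
sample = (
    b"Remove API key\n\n"
    b"Former-commit-id: 0123456789abcdef0123456789abcdef01234567\n"
)
print(strip_footer(sample))  # b'Remove API key\n'
```

Messages without a footer pass through unchanged apart from the normalised trailing newline, so the function is safe to apply to every commit.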