bfg-repo-cleaner
Add --blob-exec to run system commands
From the `git-filter-branch` docs, describing BFG:
The command options are much more restrictive than git-filter branch, and dedicated just to the tasks of removing unwanted data- e.g: --strip-blobs-bigger-than 1M.
This is true. BFG's simple and easy options are (naturally) limited to a subset of all possible data manipulations.
This change adds `--blob-exec`, which can execute any system command or script while still taking advantage of BFG's killer optimizations of (1) parallelism and (2) per-blob (rather than per-commit) rewrites. (Though it does concede the advantage of in-process-only operations.)
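To make the per-blob idea concrete, here is a minimal sketch and not the code in this PR. It assumes the blob is piped to the command on stdin and the cleaned content is read back from stdout; how `--blob-exec` actually hands the blob over is not described here. The names `BlobExecSketch`, `filterBlob`, and `filterOnce` are hypothetical, and the cache keyed by blob ID only illustrates why each distinct blob needs to be processed once, no matter how many commits reference it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class BlobExecSketch {
    // Hypothetical cache: blob ID -> cleaned content, safe to share across threads.
    private static final Map<String, byte[]> CLEANED = new ConcurrentHashMap<>();

    /** Pipes one blob through an external command and returns the command's stdout. */
    static byte[] filterBlob(List<String> command, byte[] blob) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectError(ProcessBuilder.Redirect.INHERIT)
                    .start();
            // Feed stdin from a separate thread so a full pipe buffer cannot deadlock us.
            Thread writer = new Thread(() -> {
                try (OutputStream stdin = process.getOutputStream()) {
                    stdin.write(blob);
                } catch (IOException e) {
                    // The command may exit without reading all of its input.
                }
            });
            writer.start();
            byte[] filtered;
            try (InputStream stdout = process.getInputStream()) {
                filtered = stdout.readAllBytes(); // requires Java 9+
            }
            writer.join();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException(command + " exited with " + exitCode);
            }
            return filtered;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs the command at most once per blob ID; repeat lookups return the cached result. */
    static byte[] filterOnce(String blobId, List<String> command, byte[] blob) {
        return CLEANED.computeIfAbsent(blobId, id -> filterBlob(command, blob));
    }
}
```

Because the cache is a `ConcurrentHashMap`, `filterOnce` could be called from several worker threads at once, which is roughly how the parallelism and per-blob points combine. The command is passed as an explicit argument list, so no string tokenizing is involved.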
Users can use any scripting language or tool; Scala, awesome though it is, isn't well-known, and most Git users -- particularly the ones doing filter-branch type stuff -- are more familiar with command line tools.
A few use cases:
- I want to convert all tabs in my project to spaces, via the Unix `expand` utility. (This is more than a simple regex, as it considers mid-line tab positions.)
- I want to convert all Windows line endings to Unix line endings with `dos2unix`. This is already possible with a replace expression `\r(\n)==>$1`, though that takes some trickery to figure out.
- I want to run a code formatter like scalariform or js-beautify on each file.
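The first two use cases can be illustrated in-process. The sketch below is only an approximation of what `expand` and `dos2unix` do, written to show why tab expansion depends on the current column (so a plain regex is not enough) and what the `\r(\n)` replacement amounts to. `UseCaseSketch`, `expandTabs`, and `crlfToLf` are hypothetical names, and the tab width parameter is an assumption for illustration.

```java
final class UseCaseSketch {
    /**
     * Replaces each tab with enough spaces to reach the next tab stop,
     * i.e. the next column that is a multiple of tabWidth.
     */
    static String expandTabs(String text, int tabWidth) {
        StringBuilder out = new StringBuilder();
        int column = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\t') {
                // The padding depends on where in the line the tab occurs.
                int pad = tabWidth - (column % tabWidth);
                for (int j = 0; j < pad; j++) {
                    out.append(' ');
                }
                column += pad;
            } else if (c == '\n') {
                out.append(c);
                column = 0; // a new line starts again at column 0
            } else {
                out.append(c);
                column++;
            }
        }
        return out.toString();
    }

    /** Applies the regex from the replace expression: drop a CR only when an LF follows it. */
    static String crlfToLf(String text) {
        return text.replaceAll("\r(\n)", "$1");
    }
}
```

For example, with a tab width of 4, the tab in `"ab\tc"` becomes two spaces while the tab in `"abc\td"` becomes one, which is the mid-line behaviour a fixed regex substitution cannot reproduce. `crlfToLf` leaves a lone `\r` untouched.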
There may be some improvements to this.
- Perhaps the filter options also apply?
- This uses Java's `Runtime.exec`, whose tokenizer splits only on whitespace, with no quoting mechanism. Git uses the default shell (somehow) in a portable way, which is nicer.
- Documentation
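The tokenizing limitation is easy to reproduce. `Runtime.exec(String)` is documented to split the command with a plain `new StringTokenizer(command)`, so the sketch below mirrors that split to show what happens to a quoted argument. The `viaShell` helper is one possible workaround and an assumption on my part, not something this PR does; it also assumes a POSIX `sh` is on the `PATH`. Class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class ExecTokenizingSketch {
    /** Splits a command the way Runtime.exec(String) does: on whitespace, ignoring quotes. */
    static List<String> tokenizeLikeRuntimeExec(String command) {
        StringTokenizer tokenizer = new StringTokenizer(command);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    /**
     * Builds an argument list that lets a shell parse the command string instead,
     * so quoting behaves as users expect. Suitable for ProcessBuilder.
     */
    static List<String> viaShell(String command) {
        return List.of("sh", "-c", command);
    }
}
```

Given `sed -e "s/a b/c/"`, the whitespace split yields four tokens (`sed`, `-e`, `"s/a`, `b/c/"`), so the quoted expression is broken in two and the quote characters are passed through literally. Passing `viaShell(command)` to `new ProcessBuilder(...)` would leave the parsing to the shell.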
Nice one! Very cool new feature!
Just curious: would the blob exec script also apply to filter the blob out of the tree?
Something like:
git filter-branch -f --tree-filter myFilterScript
My use case is renaming thousands of files in a large repository while keeping the file history reachable, without needing the `--follow` flag on `git log` to reach history prior to the rename.
Can we merge this? @rtyley any feedback? Some projects even manually apply this patch to BFG and recompile it from source because this PR isn't merged.
See https://github.com/cregit/cregit/blob/master/bfg/readme.org for example
Or probably merge #169 instead, but one of them would be useful.
I am the author of the patch linked above (cregit). I have emailed both Roberto and Paul about it, but never got a response from either one.
The big issue with Paul's patch is that it is incomplete and does not properly handle the spawning of the process. My patch works well under Linux (Paul's code doesn't). I think it still needs testing. Also, I think using it requires some know-how of how BFG works, since it cannot filter specific files by full path, only by basename. I will be happy to help make this patch part of the distribution.