
Add --blob-exec to run system commands

Open pauldraper opened this issue 10 years ago • 5 comments

git-filter-branch docs, describing BFG:

The command options are much more restrictive than git-filter branch, and dedicated just to the tasks of removing unwanted data- e.g: --strip-blobs-bigger-than 1M.

This is true. BFG's simple and easy options are (naturally) limited to a subset of all possible data manipulations.

This change adds --blob-exec which can execute any system command or script, while still taking advantage of BFG's killer optimizations of (1) parallelism and (2) per-blob (rather than per-commit) rewrites. (Though it does concede the advantage of in-process-only operations.)
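To illustrate why per-blob rewriting combines well with parallelism, here is a minimal sketch of the idea. It is not BFG's actual code: the class `PerBlobRewrite`, the blob ids, and the in-memory "commits" are all hypothetical. Each distinct blob id is cleaned once, however many commits reference it, and commits are processed in parallel.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; names and structure are assumptions, not BFG internals.
final class PerBlobRewrite {
    // Cleaned content, memoized by blob id so each distinct blob is processed once.
    private final Map<String, String> cleanedByBlobId = new ConcurrentHashMap<>();
    private final AtomicInteger invocations = new AtomicInteger();
    private final UnaryOperator<String> cleaner;

    PerBlobRewrite(UnaryOperator<String> cleaner) {
        this.cleaner = cleaner;
    }

    // Returns the cleaned content, running the cleaner at most once per blob id.
    String clean(String blobId, String content) {
        return cleanedByBlobId.computeIfAbsent(blobId, id -> {
            invocations.incrementAndGet();
            return cleaner.apply(content);
        });
    }

    int invocations() {
        return invocations.get();
    }

    public static void main(String[] args) {
        PerBlobRewrite rewriter = new PerBlobRewrite(s -> s.replace("\r\n", "\n"));

        // Three made-up commits, each a map of blob id -> content.
        // Blob "b1" is unchanged across all three commits.
        List<Map<String, String>> commits = List.of(
            Map.of("b1", "readme\r\n"),
            Map.of("b1", "readme\r\n", "b2", "code v1\r\n"),
            Map.of("b1", "readme\r\n", "b3", "code v2\r\n"));

        // Commits are walked in parallel; the memo map keeps the work per-blob.
        commits.parallelStream()
               .forEach(commit -> commit.forEach(rewriter::clean));

        // Five blob references, but only three distinct blobs were cleaned.
        System.out.println("cleaner invocations: " + rewriter.invocations());
    }
}
```

With an external command as the cleaner, the same memoization would mean one process spawn per distinct blob rather than one per commit.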

Users can use any scripting language or tool; Scala, awesome though it is, isn't well-known, and most Git users -- particularly the ones doing filter-branch type stuff -- are more familiar with command line tools.


A few use cases:

  • I want to convert all tabs in my project to spaces, via the Unix expand utility. (This is more than a simple regex, as it considers mid-line tab positions.)
  • I want to convert all Windows line endings to Unix line endings with dos2unix. This is already possible with the replace expression \r(\n)==>$1, though that takes some trickery to figure out.
  • I want to run a code formatter like scalariform or js-beautify on each file.
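The first two use cases can be sketched in Java to show what the external tools are doing. This is a simplified illustration, not a reimplementation of expand or dos2unix: `BlobTextFilters` and its method names are hypothetical, and `expandTabs` assumes every non-tab character occupies exactly one column.

```java
// Hypothetical helper class for illustration only.
final class BlobTextFilters {

    // Replaces each tab with enough spaces to reach the next tab stop.
    // The number of spaces depends on the current column, which is why
    // a fixed "tab -> N spaces" regex gives different results mid-line.
    static String expandTabs(String text, int tabWidth) {
        StringBuilder out = new StringBuilder();
        int column = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\t') {
                int spaces = tabWidth - (column % tabWidth);
                for (int s = 0; s < spaces; s++) {
                    out.append(' ');
                }
                column += spaces;
            } else {
                out.append(c);
                // A newline resets the column; anything else advances it by one.
                column = (c == '\n') ? 0 : column + 1;
            }
        }
        return out.toString();
    }

    // Same regex and replacement as the replace expression \r(\n)==>$1:
    // drop a carriage return only when a line feed follows it.
    static String crlfToLf(String text) {
        return text.replaceAll("\r(\n)", "$1");
    }

    public static void main(String[] args) {
        // The tab after "ab" needs 2 spaces to reach column 4; the leading tab needs 4.
        System.out.println("[" + expandTabs("ab\tc", 4) + "]");
        System.out.println("[" + expandTabs("\tc", 4) + "]");
        System.out.println(crlfToLf("a\r\nb\r\n").equals("a\nb\n"));
    }
}
```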

Some possible improvements:

  • Perhaps the filter options also apply?
  • This uses Java's Runtime.exec, whose tokenizer splits only on whitespace, with no quoting mechanism. Git uses the default shell (somehow) in a portable way, which is nicer.
  • Documentation
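The quoting limitation can be seen with a small sketch. `Runtime.exec(String)` is documented to split its argument with `new StringTokenizer(command)`, so quotes are treated as ordinary characters. The class `ExecTokenizing`, its method names, and the sample sed command are hypothetical, and the shell-based alternative assumes a POSIX `sh` on the PATH. It is one possible workaround, not how Git does it and not portable to plain Windows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration only.
final class ExecTokenizing {

    // Splits a command the same way Runtime.exec(String) does:
    // on whitespace only, with no awareness of quotes.
    static List<String> tokenizeLikeRuntimeExec(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(command);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // Assumption: a POSIX shell named "sh" is available.
    // The shell, not Java, then interprets quoting in the command.
    static List<String> argvViaShell(String command) {
        return List.of("sh", "-c", command);
    }

    public static void main(String[] args) {
        String command = "sed -e 's/foo bar/baz/'";

        // The quoted sed script is split in two, with the quotes left in place.
        System.out.println(tokenizeLikeRuntimeExec(command));

        // Passing the whole string to a shell keeps it as a single argument.
        // The builder is only constructed here, not started.
        ProcessBuilder builder = new ProcessBuilder(argvViaShell(command));
        System.out.println(builder.command());
    }
}
```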

pauldraper avatar Feb 23 '15 05:02 pauldraper

Nice one! Very cool new feature!

Just curious: would the blob exec script also apply to filter the blob out of the tree? Something like: git filter-branch -f --tree-filter myFilterScript

My use case is renaming thousands of files in a large repository while keeping the file history reachable, without needing the --follow flag on git log to reach history from before the rename.

jfoliveira avatar Mar 10 '15 17:03 jfoliveira

Can we merge this? @rtyley any feedback? Some projects even manually apply this patch to BFG and recompile it from source because this PR isn't merged.

copumpkin avatar May 01 '17 19:05 copumpkin

See https://github.com/cregit/cregit/blob/master/bfg/readme.org for example

copumpkin avatar May 01 '17 19:05 copumpkin

Or probably merge #169 instead, but one of them would be useful.

copumpkin avatar May 01 '17 19:05 copumpkin

I am the author of the patch linked above (cregit). I have emailed both Roberto and Paul about it, but never got a response from either one.

The big issue with Paul's patch is that it is incomplete and does not properly handle spawning the process. My patch works well under Linux (Paul's code doesn't). I think it still needs testing. Also, I think using it requires some know-how about how BFG works, since it cannot filter specific files by full path, only by basename. I will be happy to help make this patch part of the distribution.

dmgerman avatar May 01 '17 22:05 dmgerman