bfg-repo-cleaner
sometimes --strip-blobs-bigger-than and --strip-biggest-blobs do not work. repack?
Simplest reproducible example:
git init test && cd test
seq 1000000 > seq.txt
git add seq.txt
git commit -m 'added seq'
git rm seq.txt
git commit -m 'deleted'
cd ..
git clone --mirror test
bfg -b 1K test.git/
Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?
bfg -B 10 test.git/
Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?
BFG aborting: No refs to update - no dirty commits found??
## let's repack
cd test && git repack
cd .. && git clone --mirror test
bfg -b 1K test.git
### NOW IT WORKS
The example looks silly, but the same problem occurred on a rather big real repository. If repacking really fixes the problem, I think you should make that clear in the documentation: it should be the very first thing to do.
Thanks for describing a simple test case! It doesn't surprise me though: the BFG does its search for big objects in the packfile index (the quickest way to find out what the biggest objects are), and a 'pretend' repo like this wouldn't have any packfile yet. Generally, a real repo with big files would probably have them in a packfile, at least if they were added some time ago or if you cloned it from a remote location. Could you give some background:
- was your real big repo local or remote (if remote, who was it hosted with?)
- was it shared with other people?
- how old was it?
- roughly how many objects were in it (just order of magnitude, eg 10, 100, 1000)?
- how long ago did the offending big files get added?
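The packfile-index behaviour described above is easy to check directly: `git count-objects -v` reports loose versus packed objects. A minimal sketch with a throwaway repo (file names and messages are illustrative, not from the original report):

```shell
# Freshly committed objects are "loose"; nothing is in a packfile yet,
# so a tool that only scans the packfile index sees no blobs at all.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email you@example.com
git config user.name you
seq 1000 > seq.txt
git add seq.txt
git commit -q -m 'added seq'

# 'count' = loose objects, 'in-pack' = objects stored in packfiles
git count-objects -v
```

In a repo like this, `count` is non-zero and `in-pack` is 0, which matches the "no large blobs found in packfiles" warning.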
On Mon, Jan 5, 2015 at 9:07 PM, Roberto Tyley [email protected] wrote:
> Thanks for describing a simple test case! It doesn't surprise me though: the BFG does its search for big objects in the packfile index (the quickest way to find out what the biggest objects are), and a 'pretend' repo like this wouldn't have any packfile yet.
You're welcome. I'm glad to contribute to this very useful tool!
> Generally, a real repo with big files would probably have them in a packfile, at least if they were added some time ago or if you cloned it from a remote location.
Hmm, I'm not so sure. Of course it will have some packs, but some recently added or modified files could be unpacked. I can test it if you want, but I guess that if you repack, then add and commit a new file, then clone with --mirror, the new file will not be in a pack.
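That guess can be verified directly: after a full repack, any subsequent commit's objects land in the loose store, not in the pack. A sketch with a throwaway repo (names are illustrative):

```shell
set -e
cd "$(mktemp -d)"
git init -q test && cd test
git config user.email you@example.com
git config user.name you
echo one > a.txt && git add a.txt && git commit -q -m 'first'

# Pack everything that exists so far: -a repacks all reachable objects,
# -d deletes the now-redundant loose copies.
git repack -a -d -q
git count-objects -v | grep '^count:'     # loose objects after repack

# A commit made *after* the repack creates new loose objects,
# which a packfile-index scan cannot see.
echo two > b.txt && git add b.txt && git commit -q -m 'second'
git count-objects -v | grep '^count:'     # loose objects again
```

The second `count:` line is non-zero even though the repo was just repacked, confirming the scenario described above.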
> Could you give some background:
> - was your real big repo local or remote (if remote, who was it hosted with?)

local

> - was it shared with other people?

yes

> - how old was it?

~1 year

> - roughly how many objects were in it (just order of magnitude, eg 10, 100, 1000)?

~200

> - how long ago did the offending big files get added?

~6 months ago
— Reply to this email directly or view it on GitHub https://github.com/rtyley/bfg-repo-cleaner/issues/65#issuecomment-68769470 .
I am not a git expert, but I found the initial example provided by kforner helpful. I had a similar problem on a real repository and found that BFG's --strip-blobs-bigger-than worked after manually packing with this command, which repacks the repository:
$ git gc
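For the record, `git gc` repacks reachable loose objects into a packfile (among other housekeeping), which is why it makes the big blobs visible to the BFG's packfile scan. A minimal sketch in a throwaway repo (illustrative names):

```shell
set -e
cd "$(mktemp -d)"
git init -q test && cd test
git config user.email you@example.com
git config user.name you
seq 1000 > big.txt && git add big.txt && git commit -q -m 'add big file'

echo "before gc:"
git count-objects -v | grep -E '^(count|in-pack):'

# git gc packs reachable loose objects into a packfile
# (and prunes unreachable garbage)
git gc --quiet

echo "after gc:"
git count-objects -v | grep -E '^(count|in-pack):'
```

Before the gc, all objects show up under `count` (loose); afterwards they appear under `in-pack`, where the BFG can find them.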
Running "git gc" before following the tutorial on the main page solved my issue too. This should maybe be mentioned there as well. This tool is a godsend btw.
Ditto, running "git gc" fixed the issue for me too.
2015 issue but the solution still works in 2018 :). Add git gc to the docs; it saved my git repo too.
Updated documentation on this would be very helpful. Not many people would think to look in the GitHub issues for this little jewel. Running git gc first fixed the issue without any trouble. Very nice tool; thanks for providing this to the community.
Just to say that I found this and it also solved my problem... it would be great if it was added to the documentation. You should do it before you clone.
git gc didn't work for me, trying to clean this repo