bfg-repo-cleaner

sometimes --strip-blobs-bigger-than and --strip-biggest-blobs do not work. repack ?

Open kforner opened this issue 10 years ago • 9 comments

Simplest reproducible example:

git init test && cd test
seq 1000000 > seq.txt
git add seq.txt 
git commit -m 'added seq'
git rm seq.txt 
git commit -m 'deleted'

cd ..
git clone --mirror test
bfg -b 1K test.git/
Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?

bfg -B 10 test.git/
Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?
BFG aborting: No refs to update - no dirty commits found??

## let's repack
cd test && git repack

cd .. && rm -rf test.git && git clone --mirror test
bfg -b 1K test.git
### NOW IT WORKS

The example looks silly, but the same problem occurred on a rather big real repository. If the repack really fixes the problem, I guess you should make that clear in the documentation: it should be the very first thing to do.

kforner avatar Jan 05 '15 17:01 kforner

Thanks for describing a simple test case! It doesn't surprise me though: the BFG does its search for big objects in the packfile index, the quickest way to find out what the biggest objects are, and a 'pretend' repo like this wouldn't have any packfile yet. Generally, a real repo with big files probably would have them in a packfile, at least if they were added some time ago or if you cloned it from a remote location. Could you give some background:

  • was your real big repo local or remote (if remote, who was it hosted with?)
  • was it shared with other people?
  • how old was it?
  • roughly how many objects were in it (just order of magnitude, eg 10, 100, 1000)?
  • how long ago did the offending big files get added?

rtyley avatar Jan 05 '15 20:01 rtyley
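A quick way to see whether a repository's objects are still loose or already in a packfile is `git count-objects -v`. The sketch below rebuilds the first half of the test case in a temporary directory and reads the `count` (loose objects) and `in-pack` (packed objects) fields. The `demo` identity and the temporary directory are placeholders for illustration only, and the variable names are not from the thread.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory (placeholder location).
work=$(mktemp -d)
cd "$work"

git init -q test
cd test
seq 1000000 > seq.txt
git add seq.txt
# Placeholder identity so the commit works on an unconfigured machine.
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'added seq'

# "count" = loose objects, "in-pack" = objects stored in packfiles.
git count-objects -v

loose=$(git count-objects -v | awk '/^count:/ {print $2}')
packed=$(git count-objects -v | awk '/^in-pack:/ {print $2}')
echo "loose=$loose packed=$packed"
```

If `in-pack` is 0 while `count` is not, everything is still stored as loose objects, which would be consistent with the "no large blobs matching criteria found in packfiles" warning.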

On Mon, Jan 5, 2015 at 9:07 PM, Roberto Tyley wrote:

Thanks for describing a simple test case! It doesn't surprise me though: the BFG does its search for big objects in the packfile index, the quickest way to find out what the biggest objects are, and a 'pretend' repo like this wouldn't have any packfile yet.

You're welcome. I'm glad to contribute to this very useful tool!

Generally, a real repo with big files probably would have them in a packfile, at least if they were added some time ago or if you cloned it from a remote location.

Hmm, I'm not so sure. Of course it will have some packs, but some recently added or modified files could be unpacked. I can test it if you want, but I guess that if you repack, then add and commit a new file, then clone with --mirror, the new file will not be in a pack.

Could you give some background:

  • was your real big repo local or remote (if remote, who was it hosted with?)

local

  • was it shared with other people?

yes

  • how old was it?

~ 1 year

  • roughly how many objects were in it (just order of magnitude, eg 10, 100, 1000)?

~ 200

  • how long ago did the offending big files get added?

~ 6 months ago

— Reply to this email directly or view it on GitHub https://github.com/rtyley/bfg-repo-cleaner/issues/65#issuecomment-68769470 .

kforner avatar Jan 06 '15 12:01 kforner
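The guess in this reply (repack, then commit a new file, then clone with `--mirror`) can be checked with a small script. This is only a sketch: the file names, the `commit` helper and the `demo` identity are made up for illustration. It relies on a clone from a local path copying or hard-linking the object files as they are, so loose objects stay loose; a clone over a URL (including `file://`) receives a pack instead and would behave differently.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

git init -q test
cd test

# Hypothetical helper: commit with a placeholder identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

seq 1000000 > old.txt
git add old.txt
commit 'added old'
git repack -a -d -q        # everything committed so far goes into one pack

seq 2000000 > new.txt
git add new.txt
commit 'added new'         # these objects are written as loose objects

cd ..
git clone -q --mirror test test.git   # local-path clone of the repository

# Inspect the mirror: "count" = loose objects, "in-pack" = packed objects.
git -C test.git count-objects -v
mirror_loose=$(git -C test.git count-objects -v | awk '/^count:/ {print $2}')
mirror_packed=$(git -C test.git count-objects -v | awk '/^in-pack:/ {print $2}')
echo "mirror: loose=$mirror_loose packed=$mirror_packed"
```

A non-zero loose count in the mirror means the recently committed file never reached a packfile, even though the older history did.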

I am not a git expert but I found the initial example provided by kforner helpful. I had a similar problem on a real repository and found that BFG --strip-blobs-bigger-than worked after manually packing using this command which apparently re-packs the repository:

$ git gc

rkboyce avatar May 19 '15 09:05 rkboyce
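Put together with the original test case, the workaround reported in this thread might look like the sketch below: run `git gc` in the source repository, confirm that nothing is left loose, and only then make the mirror clone that the BFG operates on. The `demo` identity and temporary directory are placeholders, and the `bfg` command is assumed to be a wrapper on the `PATH` as in the original report; the script skips that step when it is not installed.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Rebuild the test case from the original report.
git init -q test
cd test
seq 1000000 > seq.txt
git add seq.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'added seq'
git rm -q seq.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'deleted'

# Pack everything before cloning.
git gc -q
cd ..

git clone -q --mirror test test.git

# After gc, the mirror should hold its objects in a pack, not loose.
loose_after=$(git -C test.git count-objects -v | awk '/^count:/ {print $2}')
packs_after=$(git -C test.git count-objects -v | awk '/^packs:/ {print $2}')
echo "mirror: loose=$loose_after packs=$packs_after"

# Assumption: a `bfg` wrapper is installed; skipped otherwise.
if command -v bfg >/dev/null 2>&1; then
    bfg -b 1K test.git
fi
```

Checking the counts before invoking the BFG makes it easier to tell a packing problem apart from other causes, such as the report at the end of this thread where `git gc` did not help.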

Running "git gc" before following the tutorial on the main page solved my issue too. This should maybe be mentioned there as well. This tool is a godsend btw.

svermeulen avatar Nov 30 '15 18:11 svermeulen

Ditto, running "git gc" fixed the issue for me too.

fawda123 avatar Jul 24 '16 15:07 fawda123

A 2015 issue, but the solution still works in 2018 :). Add git gc to the docs. It saved my git repo too.

tkossak avatar Jul 19 '18 14:07 tkossak

Updated documentation on this would be very helpful. Not many people would think to look in the GitHub issues for this little jewel. Running git gc first fixed the issue without any trouble. Very nice tool, thanks for providing this to the community.

Morketh avatar Oct 18 '18 22:10 Morketh

Just to say that I found this and it also solved my problem... would be great if it was added to the documentation. You should do it before you clone.

apike02 avatar Mar 12 '19 11:03 apike02

git-gc didn't work for me, trying to clean this repo

sabianroberts avatar Jun 21 '20 14:06 sabianroberts