
Large log files are stored in git history

Open leezu opened this issue 7 years ago • 0 comments

The repository is quite large, so cloning it takes a long time.

Running the git_find_big.sh script from https://confluence.atlassian.com/bitbucket/reduce-repository-size-321848262.html shows that almost all of this space is taken up by files in the logs/ directory, mainly model checkpoints. Although they were removed from master, they are still present in the history. It would be great if you could purge them from the history so that the repository contains only the useful code and cloning is fast again :)
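For anyone who wants to repeat the check without the shell script, here is a rough Python sketch of the same idea. It is not what git_find_big.sh does internally; it assumes a `git` binary on the PATH and uses `git rev-list --objects --all` piped into `git cat-file --batch-check`. The function names `parse_batch_check` and `largest_blobs` are made up for this example, and the sizes are uncompressed bytes rather than pack sizes.

```python
import subprocess


def parse_batch_check(output):
    """Turn '<type> <sha> <size> <path>' lines into (size, sha, path) tuples.

    Only blobs are kept; commits and trees are skipped.
    The result is sorted with the largest blob first.
    """
    blobs = []
    for line in output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), parts[1], path))
    return sorted(blobs, reverse=True)


def largest_blobs(repo=".", top=10):
    """List the largest blobs reachable from any ref in `repo` (hypothetical helper)."""
    # Every object reachable from any ref, as '<sha> <path>' lines.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    # %(rest) echoes whatever followed the sha on the input line, i.e. the path.
    fmt = "%(objecttype) %(objectname) %(objectsize) %(rest)"
    checked = subprocess.run(
        ["git", "-C", repo, "cat-file", "--batch-check=" + fmt],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(checked)[:top]
```

Calling `largest_blobs(".")` inside a clone should list the biggest files that ever existed in the history, including ones already deleted from master.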

Essentially, git filter-branch --tree-filter 'rm -rf logs' --prune-empty HEAD should do the trick, followed by a force push of the rewritten branch.

➜  multiNLI git:(master) ✗ ./git_find_big.sh 
All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file.
size   pack   SHA                                       location
21855  15642  00839330cdaa9d77494a20cc8cd6cbc5ae37af00  logs/cbow.ckpt
7328   6706   76f5dc6d0ee336aa682f75c4cc190f36354aef91  logs/cbow.ckpt.meta
16     2      6694f57265bf02223b4de3d1c43b1384fc28321b  logs/cbow.log
16     4      490afc7354b3ac06edf8c941b31fb0408febba92  python/ebim/ebim-test.py
8      3      529ca624d015fe3746138ce0d4dac963b3040fed  LICENSE.txt
7      2      9901c21cf00a93a817dadf2af784bad3ec04060e  cbow.py
6      2      8d9922f80c0878f9482f0c7ac97ec6dce8aabce4  README.md
6      0      5008ddfcf53c02e82d7eee2e57c38e5672ef89f6  logs/.DS_Store
5      1      249b65b3e5baec5095bb8d7d4f13551d9a995dfb  python/ebim/ebim_box.py
5      2      cd710d6bcc1b947e64b7aa92a2a1a3705c5b2e27  python/train_snli.py
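To put a number on how much of the listed space belongs to logs/, the rows above can be tallied by top-level directory. This is only a small sketch over the ten rows shown in the listing, not over the whole repository; the helper name `pack_kb_by_top_level` and the "(root)" label are invented for the example.

```python
from collections import defaultdict

# The ten rows from the listing above: size (kB), pack (kB), SHA, location.
LISTING = """\
21855  15642  00839330cdaa9d77494a20cc8cd6cbc5ae37af00  logs/cbow.ckpt
7328   6706   76f5dc6d0ee336aa682f75c4cc190f36354aef91  logs/cbow.ckpt.meta
16     2      6694f57265bf02223b4de3d1c43b1384fc28321b  logs/cbow.log
16     4      490afc7354b3ac06edf8c941b31fb0408febba92  python/ebim/ebim-test.py
8      3      529ca624d015fe3746138ce0d4dac963b3040fed  LICENSE.txt
7      2      9901c21cf00a93a817dadf2af784bad3ec04060e  cbow.py
6      2      8d9922f80c0878f9482f0c7ac97ec6dce8aabce4  README.md
6      0      5008ddfcf53c02e82d7eee2e57c38e5672ef89f6  logs/.DS_Store
5      1      249b65b3e5baec5095bb8d7d4f13551d9a995dfb  python/ebim/ebim_box.py
5      2      cd710d6bcc1b947e64b7aa92a2a1a3705c5b2e27  python/train_snli.py
"""


def pack_kb_by_top_level(listing):
    """Sum the pack column (kB) per top-level directory."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        if not line.strip():
            continue
        _size_kb, pack_kb, _sha, path = line.split(None, 3)
        # Files directly in the repository root are grouped under "(root)".
        top = path.split("/", 1)[0] if "/" in path else "(root)"
        totals[top] += int(pack_kb)
    return dict(totals)


totals = pack_kb_by_top_level(LISTING)
logs_share = totals["logs"] / sum(totals.values())
print(totals, f"logs share of listed pack size: {logs_share:.1%}")
```

For these rows, logs/ accounts for 22350 kB of the 22364 kB of packed data listed, so dropping that directory from the history would remove nearly all of it.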

leezu, Jul 17 '17 02:07