distributed-learning-contributivity
The repo is way too heavy!
The repo is now 1.1 GB, which is not okay. It is likely that a dataset has been added somewhere. I will investigate, but any help is welcome.
See for yourself (size key): curl https://api.github.com/repos/SubstraFoundation/distributed-learning-contributivity
You can even use: curl https://api.github.com/repos/SubstraFoundation/distributed-learning-contributivity 2> /dev/null | grep size | tr -dc '[:digit:]'
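As far as I know, the size key returned by the GitHub API is expressed in kilobytes, so the raw number is easier to read once converted. A minimal sketch, assuming that unit; kb_to_mb is a hypothetical helper name, not something from the repo:

```shell
# Convert a size in kilobytes (assumed unit of the GitHub API "size" key)
# to megabytes, printed with one decimal place.
kb_to_mb() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1024 }'
}

# Illustrative usage (needs network access, so it is left commented out):
# size_kb=$(curl -s https://api.github.com/repos/SubstraFoundation/distributed-learning-contributivity | grep '"size"' | tr -dc '[:digit:]')
# kb_to_mb "$size_kb"
```

The helper only does the arithmetic, so it can be checked offline with any number.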
I deleted PVRL, Moving-functions, Add-Imdb-dataset[...] and dvrl. All these branches had been either dropped or rebased onto another branch, which has since been merged.
Great, thank you @arthurPignet! But the repo size seems to remain unchanged :/
Hello!
I investigated this problem a little and found that the size seems to come from the .git/objects/pack/ folder, where Git stores its packed objects.
With this command line, we can see that there are some heavy files.
git verify-pack -v .git/objects/pack/pack-*.pack | grep -v chain | sort -k3nr | head
So I tried to identify which files are so heavy. I ran this command:
git rev-list --objects --all | grep "$(git verify-pack -v .git/objects/pack/*.pack | sort -k 3 -n | tail -10 | awk '{print$1}')"
Here are the results:

So it seems that we saved models in folders which were not ignored. I hope that helps :)
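For anyone who wants to rerun the check, the same information (size and path of the heaviest files in the history) can probably be obtained in one pass with git cat-file. This is only a sketch of an alternative to the two commands above; largest_blobs is a hypothetical function name:

```shell
# List the N largest blobs reachable from any ref, as "<size in bytes> <path>".
# Must be run from inside a Git repository.
largest_blobs() {
    n="${1:-10}"
    git rev-list --objects --all |
        # %(rest) echoes the path that rev-list printed after the object name.
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        # Keep only blobs and drop the type and hash columns.
        sed -n 's/^blob [0-9a-f]* //p' |
        # Sort by size, largest first.
        sort -k1,1nr |
        head -n "$n"
}
```

Calling largest_blobs 10 at the root of the repo should list the ten heaviest files ever committed, including ones that have since been deleted from the working tree.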
By the way, we really should separate the code from its outputs (reports), which could be hosted on this open-science-oriented platform: https://osf.io/. Besides, this would fit well with a publication project (DOI for assets, etc.)!
So, the target is: patience_sept_2020-09-07_17h37 from catastrophic forgetting, the results folder commit.
Thank you @celinejacques for the check!
Great to see that the target has been found! @natct10, did you have time to remove the commit? Can we close this issue?
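In case it helps with the removal, here is one possible way to drop a folder from the whole history using the built-in git filter-branch. It is only a sketch: purge_path_from_history is a hypothetical name, the path to pass is whatever folder was identified above, and the Git documentation itself recommends the separate git-filter-repo tool for this kind of rewrite. It rewrites every commit, so it should be tried on a fresh clone first.

```shell
# Remove a path from every commit on every ref (history rewrite).
# Assumes a clean working tree and a path without single quotes.
purge_path_from_history() {
    path="$1"
    # The environment variable skips filter-branch's warning and its delay.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -r --cached --ignore-unmatch --quiet -- '$path'" \
        --prune-empty -- --all
}
```

After the rewrite, filter-branch keeps the old commits under refs/original/, so the local size only drops once those refs are deleted and git gc has run. The rewritten branches then have to be force-pushed, and everyone needs to re-clone.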
Regarding "we really should separate code from its outputs (reports), which could be hosted on this open science oriented platform https://osf.io/": I suggest opening a new issue to discuss that. I think it's a good idea.