git-lfs
After using Git LFS to manage files, the GitLab repository actually grew larger in size
I ran the following commands on a branch of the repository hosted on GitLab:
git lfs prune
git lfs install
git lfs track "*.psd"
git lfs track "*.zip"
git lfs track "*.png"
git lfs track "*.jpg"
git lfs track "*.gif"
git lfs track "*.dll"
git lfs track "*.lib"
git lfs track "*.so"
git lfs track "*.tar.gz"
git lfs track "*.deb"
git lfs track "*.war"
git lfs track "*.pdb"
git lfs track "*.ssm"
git lfs track "*.bat"
git lfs track "*.pdf"
git lfs track "*.xlsx"
git lfs track "*.docx"
git lfs track "*.txt"
git lfs track "*.svg"
git lfs track "*.pptx"
git lfs track "*.exe"
git add .
git commit -m "Enable Git LFS"
# git lfs migrate import --include="*.zip *.psd *.png *.jpg *.gif *.dll *.lib *.so *.exe *.tar.gz *.deb *.rpm *.war *.pdb *.ssm *.bat *.db *.pdf *.xlsx *.docx *.pptx *.svg *.txt" --everything
git push --force
After pushing, the repository details page in GitLab shows that the repository actually grew, from 10 GB to 13.3 GB. Why is this?
I don't know how GitLab stores LFS files and how it measures the repository size. However, I think it likely that it stores each LFS object individually and does not take advantage of their similarities. Compared to the Git pack format, this could especially increase the space consumption of *.docx files that have multiple versions. A *.docx file is a zip package in which each part is individually compressed; if multiple versions of a *.docx file include the same image, then the differences in other (e.g. text) parts of the file do not affect how the image is compressed, and Git pack compression can probably take advantage of this similarity, but the Git LFS server might not be able to do the same.
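To illustrate the pack-compression point, here is a rough local sketch using plain git only. It assumes a made-up file, report.bin, whose random bytes stand in for an already-compressed image inside a document; it is not a real *.docx file, and it says nothing about how GitLab actually stores LFS objects. It commits two nearly identical versions and compares their combined size as separate objects with the size of the pack after repacking.

```shell
#!/bin/sh
# Sketch: compare storing two near-identical file versions separately
# versus letting Git delta-compress them in a pack. Needs only plain git.
set -eu

work=$(mktemp -d)   # scratch directory; safe to delete afterwards
cd "$work"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# Version 1: 1 MiB of random (incompressible) bytes, a hypothetical
# stand-in for an already-compressed image inside a document.
head -c 1048576 /dev/urandom > report.bin
git add report.bin
git commit -q -m "report v1"

# Version 2: the same payload plus a small text change at the end.
printf 'one edited paragraph\n' >> report.bin
git add report.bin
git commit -q -m "report v2"

# Combined size of the two blobs if each is kept as its own object,
# which is what a store without delta compression would hold.
separate_bytes=$(git cat-file --batch-all-objects \
    --batch-check='%(objecttype) %(objectsize)' |
    awk '$1 == "blob" { total += $2 } END { print total }')

# Repack so Git can store one version as a delta against the other.
git gc --quiet --aggressive --prune=now
pack_bytes=$(( $(cat .git/objects/pack/*.pack | wc -c) ))

echo "stored separately: $separate_bytes bytes"
echo "in one Git pack:   $pack_bytes bytes"
```

On a typical git installation the pack should come out at roughly half the combined size, because the second version is stored as a small delta. A store that keeps every version as an independent object would not get that saving.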
git lfs track "*.bat"
Tracking *.bat files with Git LFS seems like a bad idea. In my experience, *.bat files are not very large, and if there are multiple versions, then the differences between versions are small, so ordinary Git storage already handles them efficiently.
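If it helps when deciding which patterns to keep, this sketch shows one way to check which paths a set of tracking patterns would route through LFS. It writes the attribute lines that git lfs track generates into .gitattributes by hand, so it needs only plain git. The example paths (art/logo.psd, scripts/build.bat and so on) are hypothetical.

```shell
#!/bin/sh
# Sketch: see which paths a set of LFS tracking patterns would capture.
set -eu

work=$(mktemp -d)   # scratch directory; safe to delete afterwards
cd "$work"
git init -q .

# These lines mirror what `git lfs track "*.psd"` and friends append to
# .gitattributes; writing them by hand avoids needing git-lfs here.
cat > .gitattributes <<'EOF'
*.psd filter=lfs diff=lfs merge=lfs -text
*.bat filter=lfs diff=lfs merge=lfs -text
*.txt filter=lfs diff=lfs merge=lfs -text
EOF

# git check-attr reports the attribute value for a path, even when the
# file does not exist yet. The paths below are made-up examples.
for path in art/logo.psd scripts/build.bat notes.txt src/main.c; do
    git check-attr filter -- "$path"
done

bat_filter=$(git check-attr filter -- scripts/build.bat)
c_filter=$(git check-attr filter -- src/main.c)
```

Any path reported with filter: lfs would be stored as an LFS object on the next commit, including small text files such as scripts and notes.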
Hey, I'm sorry for the trouble, and I hope you've contacted GitLab's support team and asked for their help in answering your question.
As this project is just for the open-source Git LFS client software, we usually can't answer questions about the internal operations of the various Git LFS hosting providers like GitLab, GitHub, etc. You will be best off contacting GitLab directly, as they should be able to explain what their system is doing in your case.
That said, I suspect @KalleOlaviNiemitalo's comments above are correct, that GitLab stores each Git LFS object separately.
As well, although you have run a git push --force command, GitLab's servers may not immediately prune your existing Git history and run a full Git garbage collection on your repository, so they could be storing the commits from both your previous Git history (which didn't use Git LFS) and your new one (which does use Git LFS).
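The same effect can be seen in a local repository. The sketch below is only an analogy under stated assumptions: it uses a throwaway repository and a made-up file, big.bin, and it shows local git behaviour, not GitLab's server-side housekeeping, which these commands do not control. An object that no branch references any longer still exists on disk until reflogs are expired and garbage collection prunes it.

```shell
#!/bin/sh
# Sketch: unreferenced objects survive until garbage collection prunes them.
set -eu

work=$(mktemp -d)   # scratch directory; safe to delete afterwards
cd "$work"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo "readme" > README
git add README
git commit -q -m "initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Add a large file on a throwaway branch, then delete that branch so
# that no branch references the file any more.
git checkout -q -b with-big-file
head -c 65536 /dev/urandom > big.bin
git add big.bin
git commit -q -m "add big file"
old_blob=$(git rev-parse HEAD:big.bin)
git checkout -q "$main_branch"
git branch -q -D with-big-file

# Report whether an object ID still exists in the object store.
object_state() {
    if git cat-file -e "$1" 2>/dev/null; then echo present; else echo gone; fi
}

before=$(object_state "$old_blob")

# Drop reflog entries, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc --quiet --prune=now
after=$(object_state "$old_blob")

echo "before gc: $before, after gc: $after"
```

Until the equivalent cleanup runs on the server, the old non-LFS history and the new LFS objects could both count toward the reported size.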