gitea
Repository size includes .git folder
Description
To reproduce on a local instance:
- create an empty repo (use default values for all fields in the create repo form)
- query the database: select size from repository order by created_unix desc limit 1;
What I expected: size is 0 because there are no files uploaded in repo
What happened: size is 82944 because of the .git folder.
Additional context:
GitHub API reports size 0 for empty repositories:
curl https://api.github.com/repos/{user}/{empty-repo}
GitLab reports size 61.44 KiB for empty repos: https://gitlab.com/{user}/{empty-repo}/-/usage_quotas (they must have done some optimisations to reduce the size?)
Feel free to close this issue if this is the expected behaviour.
Gitea Version
4f14c6de
Can you reproduce the bug on the Gitea demo site?
Yes
Log Gist
No response
Screenshots
No response
Git Version
2.30.2
Operating System
ubuntu 21.04
How are you running Gitea?
TAGS="bindata sqlite sqlite_unlock_notify" make backend && ./gitea web
Database
SQLite
imho this is expected behavior
Generally I'd say .git folder should be included as that size matters for the server and client when they do a (full) checkout. Still, I'd be interested in what optimization GitLab does 😉.
I suppose we have quite some sample git hooks etc in there
The calculation of the repository size is literally just the .git folder + LFS. On the server there's no checked-out version of the default branch. IMO we could add a tooltip explaining that this represents the amount of storage that the git repository takes on the server and doesn't reflect the amount of storage that the checked-out version takes.
Checked out version could actually take up even more space than on server
You could likely calculate the size of the checked-out version via some weird git command. But then you still have the .git size, which can also be huge depending on the repository, and which you can only calculate by actually cloning the repository and then calculating that folder's size.
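For what it's worth, one candidate for that "weird git command" is `git ls-tree -r -l <rev>`, which lists every blob at a revision together with its size. Below is a minimal Python sketch that sums that column. The function names `sum_blob_sizes` and `checkout_size` are made up for illustration, and this is not how Gitea or GitHub compute anything, just one way to get the number without a checkout.

```python
import subprocess


def sum_blob_sizes(ls_tree_output: str) -> int:
    """Sum the size column of `git ls-tree -r -l <rev>` output.

    Each line looks like "<mode> <type> <object> <size>\t<path>".
    Submodule entries (type "commit") report "-" as their size and are skipped.
    """
    total = 0
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue
        # Everything before the first tab is metadata; the rest is the path.
        meta, _, _path = line.partition("\t")
        fields = meta.split()
        if len(fields) != 4:
            continue
        size = fields[3]
        if size != "-":
            total += int(size)
    return total


def checkout_size(repo_path: str, rev: str = "HEAD") -> int:
    """Return the total size in bytes of all files at `rev`, without a checkout.

    Raises subprocess.CalledProcessError if `rev` does not resolve,
    which is the case for HEAD in a freshly created empty repository.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "ls-tree", "-r", "-l", rev],
        capture_output=True,
        text=True,
        check=True,
    )
    return sum_blob_sizes(result.stdout)
```

This only counts file contents at a single revision, so it says nothing about history, packfile compression, or the size of the .git folder a clone would end up with.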
So GitHub API must then give the size of the checkout, but I think it's the less interesting metric of the two.
One more thing: it depends on who the size is calculated for.
If the size is calculated for users, then an "empty" file/dir is 0 bytes and a 1-byte file is 1 byte.
If the size is calculated for the OS/filesystem, an "empty" or 1-byte file still occupies hundreds to thousands of bytes on the filesystem (maybe tens of thousands, it really depends), and a lot of empty files can also exhaust all the space on a filesystem. A server admin may care more about this size.
Maybe it could be better to clarify who the size is calculated for:
- for a simple result, only calculate the size of the .git/objects directory (+ LFS). Then if the repo is empty, the size is 0.
- for a general server admin purpose, calculate the repo directory (including config/refs/hooks, etc) (+ LFS).
- for a filesystem space usage purpose, calculate the occupied space on the filesystem; the result should be more or less similar if the git repo doesn't have a large number of small files.
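To make the three options concrete, here is a rough Python sketch that computes all of them for one bare repository directory. It is only an illustration under a few assumptions: `repo_sizes` is a hypothetical helper, not Gitea code; LFS objects are not counted; the space taken by the directories themselves is ignored; and the on-disk figure relies on `st_blocks`, which is only available on POSIX systems.

```python
import os


def repo_sizes(repo_dir: str) -> dict:
    """Return three size figures (in bytes) for a bare repository directory.

    objects:   apparent size of the files under <repo_dir>/objects only
    directory: apparent size of every file under <repo_dir>
    on_disk:   space the filesystem has allocated for those files
    """
    # Normalise once so path comparisons below are consistent.
    repo_dir = os.path.abspath(repo_dir)
    objects_root = os.path.join(repo_dir, "objects")
    sizes = {"objects": 0, "directory": 0, "on_disk": 0}

    for root, _dirs, files in os.walk(repo_dir):
        in_objects = os.path.commonpath([root, objects_root]) == objects_root
        for name in files:
            st = os.lstat(os.path.join(root, name))
            sizes["directory"] += st.st_size
            if in_objects:
                sizes["objects"] += st.st_size
            # st_blocks is counted in 512-byte units on POSIX systems;
            # fall back to the apparent size where it is unavailable.
            blocks = getattr(st, "st_blocks", None)
            sizes["on_disk"] += st.st_size if blocks is None else blocks * 512

    return sizes
```

On a freshly created empty repository the "objects" figure should be 0, since objects/ only contains empty subdirectories, while "directory" picks up the config, description and sample hooks.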
IMO "for a general server admin purpose" is enough at the moment.
So I think this issue could be closed until there are some new ideas about how to improve the display.