
inodes on behemoths

Open lrytz opened this issue 5 years ago • 2 comments

Spinning off the discussion from https://github.com/scala/scala-dev/issues/732#issuecomment-728671079 into a new ticket

Indeed, it looks like inode exhaustion, rather than actual disk space, is the more likely issue. On behemoth-1:

admin@ip-172-31-2-3:~$ df -hi
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/xvdj         25M   25M  816K   97% /home/jenkins

while disk space looks fine:

/dev/xvdj       393G  244G  130G  66% /home/jenkins

The community build workspaces contain huge numbers of files and directories. For example, "scala-2.13.x-jdk11-integrate-community-build" currently has 103 extraction directories:

admin@ip-172-31-2-3:~$ ls /home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction | wc -l
103

A single one of those has > 200k inodes:

admin@ip-172-31-2-3:~$ find /home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction/82d4b745facaabd414be8c97dd8725a670038658 | wc -l
207593

Looking a bit closer, it seems we could save more than 40% of inodes by not pulling in all the git refs for pull requests. They look like this:

/home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction/82d4b745facaabd414be8c97dd8725a670038658/projects/93deaed81507c97b97bdf01b44a6723b14827dc1/.git/refs/pull/110

Counting directories:

admin@ip-172-31-2-3:~$ find /home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction/82d4b745facaabd414be8c97dd8725a670038658 -type d | wc -l
80892
admin@ip-172-31-2-3:~$ find /home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction/82d4b745facaabd414be8c97dd8725a670038658 -type d | grep '\/pull\/' | wc -l
43463

Among the files in the extraction, a large number are again git refs corresponding to pull requests:

admin@ip-172-31-2-3:~$ find /home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction/82d4b745facaabd414be8c97dd8725a670038658 -type f | wc -l
126693
admin@ip-172-31-2-3:~$ find /home/jenkins/workspace/scala-2.13.x-jdk11-integrate-community-build/target-0.9.17/extraction/82d4b745facaabd414be8c97dd8725a670038658 -type f | grep -E 'pull\/[0-9]+\/head' | wc -l
43463
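
The "> 40%" estimate can be checked against the counts above. This is a rough sketch: it treats every line printed by `find` as one inode and assumes the matched directories and files do not overlap. The variable names are only illustrative.

```shell
total=207593      # all entries found in the extraction directory
pull_dirs=43463   # directories whose path contains /pull/
pull_files=43463  # files matching pull/<n>/head

pull_total=$((pull_dirs + pull_files))
# Integer percentage scaled by 10, so one decimal place survives (truncated).
tenths=$((pull_total * 1000 / total))
echo "pull request refs: $pull_total of $total inodes ($((tenths / 10)).$((tenths % 10))%)"
```

Each pull request ref costs two inodes here, one for the `pull/<n>` directory and one for its `head` file, which is why the two counts are identical.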

@SethTisue do you think we can do something about these git refs to pull requests?

lrytz avatar Nov 18 '20 11:11 lrytz

Hmm, I remember talking with @cunei once about the possibility of shallow-cloning. I'll see if I can dig that conversation up.
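
For illustration, here is a minimal sketch of what a shallow clone does and does not fetch, run against a throwaway local repository. It assumes the pull request refs only arrive through an explicit `refs/pull/*` fetch refspec, which this thread does not confirm for the community build. The `demo` directory and the `count_pull_refs` helper are hypothetical names.

```shell
# Throwaway demo: nothing here touches the Jenkins workspaces.
demo=$(mktemp -d)
git init -q "$demo/origin"
git -C "$demo/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
git -C "$demo/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"
# Simulate GitHub-style pull request refs on the remote.
git -C "$demo/origin" update-ref refs/pull/1/head HEAD~1
git -C "$demo/origin" update-ref refs/pull/2/head HEAD

# Count the refs stored under refs/pull/ in a repository.
count_pull_refs() {
  git -C "$1" for-each-ref refs/pull | wc -l
}

# Shallow clone: default refspec, history truncated to a single commit.
git clone -q --depth 1 --single-branch "file://$demo/origin" "$demo/shallow"

# Full clone plus an explicit fetch of every pull request head, for comparison.
git clone -q "file://$demo/origin" "$demo/full"
git -C "$demo/full" fetch -q origin '+refs/pull/*/head:refs/pull/*/head'

echo "shallow: $(count_pull_refs "$demo/shallow") pull refs"
echo "full:    $(count_pull_refs "$demo/full") pull refs"
```

In this sketch the pull refs appear only in the clone that asks for them by refspec, so a shallow clone on its own would trim history but might not remove the refs if the build tool requests them explicitly.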

SethTisue avatar Nov 19 '20 02:11 SethTisue

Alternatively we can create a new EBS volume with a new file system where we explicitly specify the number of inodes on creation (https://askubuntu.com/questions/600159/how-can-i-create-an-ext4-partition-with-an-extra-large-number-of-inodes), then copy over the files.
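
As a rough sizing sketch for that option, the `df` figures quoted earlier in this issue (393G, 25M inodes) work out to roughly 16 KiB per inode, close to the 16384 bytes-per-inode ratio that `mke2fs` commonly defaults to. The 4096 target below is a hypothetical value chosen for illustration, and `/dev/xvdX` is a placeholder, not a real device name.

```shell
# Rounded figures from the df output: 393G filesystem, 25M inodes.
size_bytes=$((393 * 1024 * 1024 * 1024))
inodes=$((25 * 1024 * 1024))
current_ratio=$((size_bytes / inodes))
echo "current bytes per inode: $current_ratio"

# Hypothetical target: one inode per 4 KiB instead.
target_ratio=4096
target_inodes=$((size_bytes / target_ratio))
echo "inodes at 1 per $target_ratio bytes: $target_inodes"

# The matching mkfs invocation is left commented out because it destroys data.
# -i sets bytes per inode; -N would set the inode count directly instead.
# sudo mkfs.ext4 -i "$target_ratio" /dev/xvdX
```

A lower bytes-per-inode ratio reserves more space for inode tables, so the usable capacity of the new volume would shrink somewhat in exchange.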

lrytz avatar Nov 24 '20 14:11 lrytz