kube-spawn
No space left on device /var/lib/machines (again)
To Reproduce:
- Install Fedora 28 from https://cloud.fedoraproject.org/ (GP2 image) on AWS:
- m4.large
- Disk: at least 50GiB
- ssh:
ssh -i ~/.ssh/$KEY fedora@$IP
- Start a kube-spawn Kubernetes cluster on the AWS EC2 instance:
export KUBERNETES_VERSION=v1.9.9 # or other version
export KUBE_SPAWN_VERSION=master
sudo setenforce 0
sudo dnf install -y btrfs-progs git go iptables libselinux-utils polkit qemu-img systemd-container make docker
mkdir go
export GOPATH=$HOME/go
curl -fsSL -O https://github.com/containernetworking/plugins/releases/download/v0.6.0/cni-plugins-amd64-v0.6.0.tgz
sudo mkdir -p /opt/cni/bin
sudo tar -C /opt/cni/bin -xvf cni-plugins-amd64-v0.6.0.tgz
mkdir -p $GOPATH/src/github.com/kinvolk
cd $GOPATH/src/github.com/kinvolk
git clone https://github.com/kinvolk/kube-spawn.git
cd kube-spawn/
git checkout $KUBE_SPAWN_VERSION
make DOCKERIZED=n
sudo make install
sudo -E kube-spawn create --kubernetes-version $KUBERNETES_VERSION
sudo -E kube-spawn start --nodes=3
And I get the error:
Got 17% of https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_developer_container.bin.bz2. 1min 31s left at 4.9M/s.
Failed to write file: Success
Failed to write file: Success
Failed to write file: No space left on device
Failed to retrieve image file. (Wrong URL?)
Exiting.
Failed to start cluster: error running machinectl pull-raw: exit status 1
/var/lib/machines is full.
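One way to confirm that the machine pool, rather than the root disk, is what filled up is to compare their usage. The following is a minimal sketch; the helper names `fs_usage_percent` and `check_pool` and the 90% threshold are illustrative assumptions, not part of kube-spawn.

```shell
#!/bin/sh
# Hypothetical helper: print the used-space percentage (without the % sign)
# of the filesystem that holds the given path.
fs_usage_percent() {
    # -P forces the POSIX one-line-per-filesystem layout; column 5 is capacity.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: report whether a path is nearly full.
# The 90 threshold is arbitrary.
check_pool() {
    path="${1:-/var/lib/machines}"
    used=$(fs_usage_percent "$path")
    # An empty result means df could not inspect the path.
    [ -n "$used" ] || return 1
    if [ "$used" -ge 90 ]; then
        echo "$path is ${used}% full"
    else
        echo "$path has room (${used}% used)"
    fi
}
```

Running `check_pool /var/lib/machines` and `check_pool /` side by side should show whether only the pool is exhausted.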
Expected outcome
- [ ] troubleshooting.md should explain how to increase the size of /var/lib/machines.
- [ ] Issue https://github.com/kinvolk/kube-spawn/issues/66 and PR https://github.com/kinvolk/kube-spawn/pull/70 were closed/merged but I still have the issue.
Manually running the workaround suggested in #70 seems to work:
sudo umount /var/lib/machines
sudo qemu-img resize -f raw /var/lib/machines.raw $((10*1024*1024*1024))
sudo mount -t btrfs -o loop /var/lib/machines.raw /var/lib/machines
sudo btrfs filesystem resize max /var/lib/machines
sudo btrfs quota disable /var/lib/machines
It seems I cannot run the workaround on a fresh install of Fedora because /var/lib/machines.raw does not exist yet. I have to first run into the error, then apply the workaround, and then try kube-spawn again. I guess that's why #70 didn't work.
$ sudo umount /var/lib/machines
umount: /var/lib/machines: not mounted.
$ sudo qemu-img resize -f raw /var/lib/machines.raw $((10*1024*1024*1024))
qemu-img: Could not open '/var/lib/machines.raw': Could not open '/var/lib/machines.raw': No such file or directory
$ sudo mount -t btrfs -o loop /var/lib/machines.raw /var/lib/machines
mount: /var/lib/machines: failed to setup loop device for /var/lib/machines.raw.
$ sudo btrfs filesystem resize max /var/lib/machines
ERROR: not a btrfs filesystem: /var/lib/machines
$ sudo btrfs quota disable /var/lib/machines
ERROR: not a btrfs filesystem: /var/lib/machines
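To avoid the failures above, the workaround can be wrapped in a guard that refuses to run until /var/lib/machines.raw exists. This is only a sketch under that assumption; the function names `gib_to_bytes`, `pool_image_exists` and `resize_pool` are made up for illustration, and the commands inside are the same ones listed in the workaround.

```shell
#!/bin/sh
# Hypothetical helper: convert GiB to bytes, e.g. 10 -> $((10*1024*1024*1024)).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Hypothetical helper: succeed only if the loopback pool image exists.
pool_image_exists() {
    [ -f "${1:-/var/lib/machines.raw}" ]
}

# Hypothetical wrapper around the workaround; the argument is the new size in GiB.
resize_pool() {
    image=/var/lib/machines.raw
    mount_dir=/var/lib/machines
    if ! pool_image_exists "$image"; then
        echo "$image does not exist yet; run kube-spawn start once first" >&2
        return 1
    fi
    # Only unmount if the pool is currently mounted.
    if mountpoint -q "$mount_dir"; then
        sudo umount "$mount_dir" || return 1
    fi
    sudo qemu-img resize -f raw "$image" "$(gib_to_bytes "$1")" || return 1
    sudo mount -t btrfs -o loop "$image" "$mount_dir" || return 1
    sudo btrfs filesystem resize max "$mount_dir" || return 1
    sudo btrfs quota disable "$mount_dir"
}
```

After a first failed `kube-spawn start`, `resize_pool 10` would repeat the manual steps with a 10 GiB image.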
@alban Can you say what steps you took on a fresh Fedora system? I would like to add them to the docs.
@schu Do you mean the steps to work around the issue? See the steps in https://github.com/kinvolk/kube-spawn/issues/282, search for "First attempt to use kube-spawn" and "Workaround for "no space left on device"".
You can run e.g. 'sudo machinectl set-limit 20G' before you launch the first machine. This sets the maximum size limit before the btrfs pool is created.
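The ordering suggested above could look roughly like the following sketch. It has not been verified on a fresh Fedora install; the `run` wrapper, the `DRY_RUN` switch and the `start_with_limit` name are assumptions added for illustration, while the kube-spawn commands are taken from the reproduction steps.

```shell
#!/bin/sh
# Hypothetical wrapper: print commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Raise the pool limit first, then create and start the cluster.
# The function name and the 20G default are illustrative.
start_with_limit() {
    limit="${1:-20G}"
    run sudo machinectl set-limit "$limit" || return 1
    run sudo -E kube-spawn create --kubernetes-version "$KUBERNETES_VERSION" || return 1
    run sudo -E kube-spawn start --nodes=3
}
```

With `DRY_RUN=1` set, `start_with_limit 20G` only prints the three commands, which makes it easy to review the order before running them for real.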
@donbowman Yes, we can document that approach.
Anyway, to fix this issue we need to merge https://github.com/kinvolk/kube-spawn/pull/283, which looks good to me. I'm thinking about merging it tomorrow if there are no objections. Documentation is still in progress, so I can make a follow-up PR to address the documentation issue.
Hmm, I didn't mean to close it. Will reopen it, as there's a documentation issue left.