sysbox
k8s: root cannot chown in emptyDir volume mount
Maybe this is expected, but I'm facing workloads that are failing under sysbox because root can't change the owner of files and directories in a shared volume mount.
Create my pod:
apiVersion: v1
kind: Pod
metadata:
  name: sysbox-test
  namespace: default
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536"
spec:
  runtimeClassName: sysbox-runc
  containers:
  - name: ubu-bio-systemd-docker
    image: registry.nestybox.com/nestybox/ubuntu-bionic-systemd-docker
    command: ["/sbin/init"]
    volumeMounts:
    - mountPath: /tmp/share
      name: share
  - name: foo
    image: ubuntu
    command: ["/bin/sleep", "180"]
    securityContext:
      runAsUser: 999
      runAsGroup: 999
    volumeMounts:
    - mountPath: /tmp/share
      name: share
  securityContext:
    fsGroup: 999
  restartPolicy: Always
  volumes:
  - name: share
    emptyDir:
      medium: Memory
Then from the shell on the system container:
kubectl exec -it sysbox-test -- /bin/bash
root@sysbox-test:/# mkdir /tmp/share/foo
root@sysbox-test:/# ls -lhd /tmp/share/foo/
drwxr-sr-x 2 root nogroup 40 May 25 18:20 /tmp/share/foo/
root@sysbox-test:/# chown 999:999 /tmp/share/foo/
chown: changing ownership of '/tmp/share/foo/': Operation not permitted
I can work around it, but this works without sysbox.
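A note for anyone debugging this: the nogroup owner in the listing above suggests that the directory's GID may not be mapped inside the container's user namespace, and Linux refuses chown on such files even for root in that namespace. The following is a minimal sketch for checking that, not something taken from the Sysbox docs. The helper name host_gid_is_mapped is hypothetical, and it assumes the second column of /proc/self/gid_map holds host-side IDs, which is the case when the map is read from inside a single-level user namespace.

```shell
# Hypothetical helper: succeeds if host-side ID $1 falls inside any range
# listed in the ID map file $2. The file is expected to use the same format
# as /proc/self/gid_map: "inside-start outside-start length" per line.
host_gid_is_mapped() {
  awk -v id="$1" '
    $2 <= id && id < $2 + $3 { found = 1 }   # id lies within this outside range
    END { exit !found }                      # exit 0 only if some range matched
  ' "$2"
}
```

Inside the system container, running host_gid_is_mapped 999 /proc/self/gid_map || echo "gid 999 is not mapped" would indicate whether the fsGroup GID is visible there, assuming the volume was given host GID 999. For comparison, stat -c '%u:%g' /tmp/share/foo prints the numeric owner; an unmapped group typically shows up as 65534.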
Hi @iamnoah, thanks for giving Sysbox a shot.
I suspect the fsGroup section is causing the problem; could you try without it?
That is, remove this:
securityContext:
  fsGroup: 999
Thanks!
@ctalledo that does allow root to chown, but of course the volume no longer has setgid, and is not owned by gid 999.
Thanks @iamnoah; let me repro on my end and get back to you a bit later. I am out-of-office right now so response may be a bit delayed.
Hi @iamnoah, using the same yaml you provided, things work fine for me with the latest sysbox-deploy-k8s (v0.5.2):
root@sysbox-test:/tmp/share# mkdir /tmp/share/foo
root@sysbox-test:/tmp/share# l
total 0
drwxr-sr-x 2 root docker 40 May 27 16:00 foo
root@sysbox-test:/tmp/share# chown 999:999 /tmp/share/foo/
root@sysbox-test:/tmp/share# l
total 0
drwxr-sr-x 2 999 docker 40 May 27 16:00 foo
root@sysbox-test:/tmp/share# l -n
total 0
drwxr-sr-x 2 999 999 40 May 27 16:00 foo
Did you use that same version (kubectl -n kube-system describe pod <sysbox-deploy-k8s-pod-name> will show you)? If not, could you try with it?
I was on 0.5.1 and tried bumping to 0.5.2, but I get the same results. Output of describing the node:
Name:               ip-10-0-146-171.ec2.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m4.large
                    beta.kubernetes.io/os=linux
                    crio-runtime=running
                    eks.amazonaws.com/capacityType=ON_DEMAND
                    eks.amazonaws.com/nodegroup=terraform-20220523214611433500000001
                    eks.amazonaws.com/nodegroup-image=ami-00f8662da7d2ffc72
                    eks.amazonaws.com/sourceLaunchTemplateId=lt-0175e4cc4d15837ed
                    eks.amazonaws.com/sourceLaunchTemplateVersion=2
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1b
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-0-146-171
                    kubernetes.io/os=linux
                    node-group-ami-type=ubuntu
                    node.kubernetes.io/instance-type=m4.large
                    sandbox-pods-using=sysbox
                    sysbox-install=yes
                    sysbox-runtime=running
                    topology.kubernetes.io/region=us-east-1
                    topology.kubernetes.io/zone=us-east-1b
                    vpc.amazonaws.com/has-trunk-attached=false
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 31 May 2022 11:53:47 -0500
Taints:             ReservedForSandboxedPod=sysbox:NoSchedule
Unschedulable:      false
...
Conditions:
  Type                Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  ----                ------  -----------------                ------------------               ------                      -------
  KernelDeadlock      False   Tue, 31 May 2022 12:28:39 -0500  Tue, 31 May 2022 11:58:32 -0500  KernelHasNoDeadlock         kernel has no deadlock
  ReadonlyFilesystem  False   Tue, 31 May 2022 12:28:39 -0500  Tue, 31 May 2022 11:58:32 -0500  FilesystemIsNotReadOnly     Filesystem is not read-only
  MemoryPressure      False   Tue, 31 May 2022 12:26:34 -0500  Tue, 31 May 2022 11:53:47 -0500  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure        False   Tue, 31 May 2022 12:26:34 -0500  Tue, 31 May 2022 11:53:47 -0500  KubeletHasNoDiskPressure    kubelet has no disk pressure
  PIDPressure         False   Tue, 31 May 2022 12:26:34 -0500  Tue, 31 May 2022 11:53:47 -0500  KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready               True    Tue, 31 May 2022 12:26:34 -0500  Tue, 31 May 2022 11:55:07 -0500  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
...
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         2
  ephemeral-storage:           20263484Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      8139476Ki
  pods:                        20
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         1930m
  ephemeral-storage:           17601085k
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      7550676Ki
  pods:                        20
System Info:
  ...
  Kernel Version:             5.13.0-1023-aws
  OS Image:                   Ubuntu 20.04.4 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.21.7
  Kubelet Version:            v1.21.9
  Kube-Proxy Version:         v1.21.9
Do CRI-O and kubelet need to match for some reason?