guardian
gdn fails with runc error on Ubuntu 22.04 LTS
Description
When running the Concourse binary (using gdn for containerization) on a Google Cloud VM with the ubuntu-2204-lts image family as the OS image, we see errors like the one below:
Aug 25 21:56:12 smoke-splendid-earwig concourse[4460]: {"timestamp":"2022-08-25T21:56:12.809930620Z","level":"error","source":"guardian","message":"guardian.create.containerizer-create.runtime-create-failed","data":{"error":"runc run: exit status 1: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting \"cgroup\" to rootfs at \"/sys/fs/cgroup\" caused: invalid argument","handle":"a17876d5-647e-492d-6ae2-311b1a56d718","session":"40.3"}}
For comparison, when running Concourse locally with docker compose, we don't see the error. The OS image is the same as on the VM in GCP:
root@c29ddbf435bd:/src# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.1 LTS"
but its kernel is 5.10.47-linuxkit.
Also, when running Concourse with the containerd runtime, which uses runc v1.1.4 directly, we don't see the error in either local Docker or the GCP VM.
Maybe it is related to the older runc currently used in guardian, which might not work well with the newer kernel in Ubuntu Jammy Jellyfish?
- Guardian release version: 1.22
- Linux kernel version: 5.15.0-1016-gcp
- Concourse version: latest dev
- Go version: 1.19
We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.
The labels on this github issue will be updated when the story is started.
This issue is being worked on under the Garden-runc-release/#233 issue
It looks like this is the same issue that other container runtimes have had with Jammy: https://github.com/containers/podman/issues/12559 .
Jammy uses cgroupv2 in the kernel, and it delegates cgroup authority to sub-processes (like the container runtime) as cgroupv2. runc supports cgroupv2 as of the v1.0.0 release, but gdn is also directly altering cgroups using the old v1 schema: https://github.com/cloudfoundry/guardian/blob/8deac7e439aca41e515a74d7c8489081b8961b97/guardiancmd/command_linux.go#L307
This will require some substantial changes in how cgroups are managed in guardian in order to support new distributions that have switched to cgroupv2.
Some updates:
Concourse with the latest gdn can run successfully on an image with cgroups v1 enabled, based on the gcloud image family ubuntu-2204-lts.