feat: mount /sys/kernel/security into kubelet
This allows the kubelet to enforce AppArmor.
Since AppArmor landed with the 1.7 kernel, it probably makes sense to backport this.
I think getting this into v1.7 fits nicely with the bump to Kubernetes 1.30, which includes the move of AppArmor config to Pod SecurityContext fields: https://github.com/kubernetes/kubernetes/pull/123435
I tried to validate whether this fixes the problem of a Pod with AppArmor configured getting stuck in `Pending` with the message `Cannot enforce AppArmor: AppArmor is not enabled on the host`.

I mounted the securityfs into the kubelet using this machine config:
```yaml
machine:
  kubelet:
    extraMounts:
      - destination: /sys/kernel/security
        type: bind
        source: /sys/kernel/security
        options:
          - bind
          - ro
```
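To sanity-check the mount (a sketch; `<node-ip>` is a placeholder), the securityfs contents and the kernel's AppArmor flag can be inspected directly on the node:

```sh
# Sketch: if securityfs is available, the apparmor directory is visible:
talosctl -n <node-ip> ls /sys/kernel/security/apparmor
# The kernel reports "Y" here when AppArmor is enabled:
talosctl -n <node-ip> read /sys/module/apparmor/parameters/enabled
```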
Then I started a Pod with an AppArmor profile specified (a profile that I had loaded into the node kernel beforehand):
```yaml
spec:
  containers:
    - name: test
      ...
      securityContext:
        appArmorProfile:
          type: Localhost
          localhostProfile: myprofile
```
I no longer get the `...AppArmor is not enabled on the host` error, but a new type of error: `Error: failed to get container spec opts: failed to generate apparmor spec opts: apparmor is not supported`.

Feels like there is something else that also needs to be fixed around the kubelet.
@konrader it might be easier if you could provide us with a full set of steps to launch a pod with an AppArmor profile, which we could include in Talos as an integration test. This way we can develop a fix and a test.
I will write down the steps needed. It does include loading an AppArmor profile (with the `apparmor_parser` cmdline tool) via a (somewhat) privileged container.

Where are the integration tests located today?
> Where are the integration tests located today?

They are in `internal/integration`, but if the test only needs some Kubernetes interactions, that is perfectly fine as well.
Here is a complete script that tests both loading an AppArmor profile into the kernel and running a Pod with the profile enabled.
```sh
#!/bin/sh
NS=apparmor

kubectl create namespace $NS
kubectl label ns $NS pod-security.kubernetes.io/enforce=privileged

kubectl apply -n $NS -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: apparmor
spec:
  terminationGracePeriodSeconds: 3
  containers:
    - name: apparmor
      image: ghcr.io/bifrostsec/apparmor-image/apparmor:22.04
      # Just spin & wait forever
      command: [ "/bin/sh", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      securityContext:
        privileged: true
        capabilities:
          add:
            - MAC_ADMIN
      volumeMounts:
        - name: sysfs
          mountPath: /host/sys
          readOnly: false
        - name: securityfs
          mountPath: /sys/kernel/security
          readOnly: false
  volumes:
    - name: sysfs
      hostPath:
        path: /sys
    - name: securityfs
      hostPath:
        path: /sys/kernel/security
EOF

kubectl -n $NS wait --for=condition=Ready pod/apparmor

# Copy over simple AppArmor audit profile
kubectl -n $NS cp audit.profile apparmor:tmp/audit.profile
# Load it into the host kernel
kubectl -n $NS exec apparmor -it -- apparmor_parser /tmp/audit.profile

kubectl apply -n $NS -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  terminationGracePeriodSeconds: 3
  containers:
    - name: test
      image: alpine:3.19.1
      # Just spin & wait forever
      command: [ "/bin/sh", "-c", "--" ]
      args: [ "while true; do sleep 30; done;" ]
      securityContext:
        appArmorProfile:
          type: Localhost
          localhostProfile: audit-all
EOF
```
This depends on the AppArmor profile being in a separate file named `audit.profile` with this content:

```
abi <abi/3.0>,

profile audit-all flags=(attach_disconnected,mediate_deleted,complain,audit) {
}
```
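As an optional check (a sketch, not part of the steps above): once the `test` Pod is running, the profile actually applied to it can be read from procfs inside the container:

```sh
# The kernel exposes the current AppArmor confinement via procfs; with the
# profile above applied, this should print something like "audit-all (complain)".
kubectl -n apparmor exec test -- cat /proc/1/attr/current
```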
I've located the error `failed to generate apparmor spec opts: apparmor is not supported` as coming from containerd: https://github.com/containerd/containerd/blob/release/1.7/pkg/cri/server/container_create_linux.go#L505

At its core it relies on containerd's own check, `apparmor.HostSupports()`, to see if the host supports AppArmor: https://github.com/containerd/containerd/blob/release/1.7/pkg/apparmor/apparmor_linux.go#L34

This check does more than the kubelet already does; notably, it also checks that `/sbin/apparmor_parser` is present, which it is not on a Talos node.
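Roughly, that check amounts to something like this (a shell sketch of the behaviour described above, not containerd's actual code; the kernel-flag path is an assumption based on the usual `/sys/module/apparmor/parameters/enabled` check):

```sh
# Both conditions must hold for the host to be considered AppArmor-capable:
# the userland parser must exist (the part that fails on Talos), and the
# kernel must report AppArmor as enabled.
if [ -e /sbin/apparmor_parser ] && \
   grep -q Y /sys/module/apparmor/parameters/enabled 2>/dev/null; then
  echo "host supports AppArmor"
else
  echo "apparmor is not supported"
fi
```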
On Talos (or other Linux platforms) where AppArmor profile loading is done either by another user-land tool or from inside a container, this check is not necessary, since containerd requires the profile to already be loaded into the kernel before you reference it for a container anyway. There seems to be a bit of controversy around this, with the extra check being removed and then reverted: https://github.com/containerd/containerd/pull/8087
This PR is still a step in the right direction, since it gets rid of the problem of the kubelet not detecting that AppArmor is enabled on the node.
@konrader we can put an empty `/sbin/apparmor_parser` into the rootfs if that helps, but I don't fully understand how it broke Moby? Would it break some workloads on Talos (not using AppArmor previously)?
I was actually thinking about adding an empty `/sbin/apparmor_parser` to the Talos rootfs to try, but didn't have my build instance up. I'm not sure either what the connection is between containerd and Moby in regards to AppArmor. Moby says on their website that they use containerd as the default container runtime, but it's still weird that they don't accept referencing AppArmor profiles (that are already loaded into the kernel) for containers unless `apparmor_parser` is present (indicating that the userland AppArmor tools are installed). It's a bit like checking that your car repair shop exists before you can start your car.
I tried to add an empty `/sbin/apparmor_parser` via machine config, but that wasn't allowed:

```yaml
machine:
  files:
    - content: ''
      permissions: 0o555
      path: /sbin/apparmor_parser
      op: create
```

I got this message during boot:

```
create operation not allowed outside of /var: "/sbin/apparmor_parser"
```

What is the easiest way to add the file in the rootfs to try out whether this would work?
> What is the easiest way to add the file in the rootfs to try out whether this would work?

The only way is to rebuild Talos with changes to the `Dockerfile` to touch that file in the rootfs.
I can get to it with the steps you provided to build a test for AppArmor.
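Something along these lines should work for a quick experiment (a sketch only; the exact stage and paths in the Talos `Dockerfile` are assumptions, not the actual layout):

```sh
# Sketch: add a step like this to the rootfs stage of the Talos Dockerfile
# (stage name and path are assumptions):
#
#   RUN touch /rootfs/sbin/apparmor_parser && chmod 0555 /rootfs/sbin/apparmor_parser
#
# then rebuild the images from the Talos repo, e.g.:
make installer
```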
> @konrader we can put an empty `/sbin/apparmor_parser` into the rootfs if that helps, but I don't fully understand how it broke Moby? Would it break some workloads on Talos (not using AppArmor previously)?
I'm looking into why Moby/Docker wants containerd to have the extra check for `/sbin/apparmor_parser`.

It seems to be related to Moby applying a default AppArmor profile, `docker-default`, to started containers (unless overridden by the `security-opt` option): https://docs.docker.com/engine/security/apparmor/#understand-the-policies

Moby removed their own AppArmor-enabled check and started relying on containerd's check: https://github.com/moby/moby/pull/42276

Moby uses the result of this check to decide whether they need to parse and load the `docker-default` profile, which they need `apparmor_parser` for: https://github.com/moby/moby/blob/master/profiles/apparmor/apparmor.go#L59

So some of their users had problems starting containers when AppArmor was enabled in the kernel but the AppArmor userland tools weren't installed.
I do see that the default-profile loading code is put under `/contrib` in containerd, and the missing `apparmor_parser` could possibly be a problem if someone deploys a workload on Talos specifying that the runtime's default profile should be applied, like this:

```yaml
securityContext:
  appArmorProfile:
    type: RuntimeDefault
```

Docs: https://kubernetes.io/docs/tutorials/security/apparmor/#securing-a-pod

I will try to verify this.
I could reproduce the scenario with Kubernetes 1.30 running on Ubuntu 22.04 after uninstalling the AppArmor userland tools (`apt remove apparmor` plus a reboot, since containerd caches the result of its AppArmor-enabled check).

With no `/sbin/apparmor_parser` I could not start a Pod with an AppArmor securityContext; I got the error message `failed to generate apparmor spec opts: apparmor is not supported` (same as on Talos after mounting `/sys/kernel/security` into the kubelet).

I then created an empty `/sbin/apparmor_parser`, and after rebooting various Kubernetes pods failed to start with this error:

```
kubelet[1706]: E0430 16:35:26.865258 1706 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CreateContainerError: \"failed to create containerd container: load apparmor profile /tmp/cri-containerd.apparmor.d3636700873: parser error(\\\"\\\"): exec: \\\"apparmor_parser\\\": executable file not found in $PATH\""
```

So it seems like the CRI implementation in containerd does try to apply a default AppArmor profile to all containers (those that do not have a specific AppArmor profile set in their security options).
If we do want AppArmor support in Talos, then I see the following options:

- Include the real `apparmor_parser` in the rootfs
- Disable applying the default profile to containers in the containerd CRI
It does not seem possible to disable the automatic application of a default AppArmor profile (called `cri-containerd.apparmor.d`) in containerd for containers started via CRI (the interface Kubernetes uses), once containerd has deemed AppArmor enabled on the host. I assume this is to make all container runtimes have the same default behaviour: https://github.com/containerd/containerd/blob/main/internal/cri/server/container_create_linux.go#L202
If we were to include `apparmor_parser` (a 1.5M binary) in the rootfs for everyone, the impact would be that people suddenly have their workloads running under a default profile without knowing it. The default profile is quite open, but it does deny access to certain things that people might not be prepared for; see the profile template here: https://github.com/containerd/containerd/blob/main/contrib/apparmor/template.go#L42
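For reference, one quick way to see whether workloads have silently picked up the default profile (a sketch using the securityfs interface this PR mounts):

```sh
# The kernel lists all loaded AppArmor profiles and their modes here;
# containerd's default profile shows up once it has been applied,
# e.g. "cri-containerd.apparmor.d (enforce)":
grep cri-containerd /sys/kernel/security/apparmor/profiles
```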
@konrader thanks for your research here. I think having AppArmor (better security) is in the best spirit of Talos. We can think about whether we want apparmor-parser as an extension, for example, but I guess this becomes a bigger change, so it would likely be in the timeframe of 1.8.
I created issue #8715 to track the actual AppArmor work; this PR makes sense on its own, so it should be good to merge.
/m