
Support for pod user namespaces

Open Piccirello opened this issue 10 months ago • 8 comments

This issue is to track Talos's support for user namespaces^0 in Kubernetes pods. User namespaces allow for strict separation between the root user in pods and the root user on the host. From the docs: "A process running as root in a container can run as a different (non-root) user in the host; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace."

User namespaces require at least Linux 6.3, which it appears Talos v1.7.0 will support. The Kubernetes docs also state that "containerd v1.7 is not compatible with the userns support in Kubernetes v1.27 to v1.29." That may mean waiting for containerd 2.0^1, though this is unclear to me.

When user namespaces are eventually supported, it would be worth mentioning as a feature in the Talos release's changelog.

Piccirello avatar Apr 05 '24 17:04 Piccirello

How will this affect Talos Linux nodes running inside containers? And potentially, in user-namespaced/rootless containers?

sanmai-NL avatar Apr 16 '24 11:04 sanmai-NL

The Kubernetes docs page^1 linked above has been updated with more information. It now states more explicitly that containerd v2 is needed:

"containerd v1.7 is not compatible with the userns support in Kubernetes v1.27 to v1.30. Kubernetes v1.25 and v1.26 used an earlier implementation that is compatible with containerd v1.7, in terms of userns support."

Piccirello avatar May 01 '24 18:05 Piccirello

Based on #8766, #8777, and #8484, it appears Talos 1.8 will use containerd 2.0. That may mean that pod user namespaces will be supported in Talos 1.8.

Piccirello avatar Jun 17 '24 19:06 Piccirello

Yes, it should be ready for that. I believe there's nothing to be done on the Talos OS side itself to support it. If you have a good test case for user namespaces (e.g. something you can kubectl apply), we'd be happy to get it into the integration tests. Thanks!
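For illustration, a manifest of the kind requested here could be generated as in the sketch below. The pod name, the busybox image, and the helper name build_pod_manifest are placeholders chosen for this example and do not come from the thread; only the hostUsers: false field and the cat /proc/self/uid_map check are taken from the discussion.

```python
import json


def build_pod_manifest(name="userns-check", image="busybox"):
    """Return a Pod manifest (as a dict) that requests a pod user namespace.

    The name and image defaults are hypothetical placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Ask for a separate user namespace instead of the host's.
            "hostUsers": False,
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "check",
                    "image": image,
                    # Print the UID mapping so the pod logs show which
                    # user namespace the container actually runs in.
                    "command": ["cat", "/proc/self/uid_map"],
                }
            ],
        },
    }


if __name__ == "__main__":
    # JSON is accepted by kubectl, so the output can be piped to
    # `kubectl apply -f -`.
    print(json.dumps(build_pod_manifest(), indent=2))
```

The printed JSON can be piped into kubectl apply -f -, after which the pod logs show the UID mapping the container ended up with.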

smira avatar Jun 28 '24 16:06 smira

It appears that there are two feature gates that need to be enabled: UserNamespacesSupport and UserNamespacesPodSecurityStandards. Both of these currently default to false, with the latter's effect described here.
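As a rough sketch, the two gates could be turned into a --feature-gates value and wrapped in a Talos machine config patch as shown below. The patch layout (machine.kubelet.extraArgs and cluster.apiServer.extraArgs) and the choice to set both gates on both components are assumptions made for this example, not something confirmed in this thread; the helper names are hypothetical.

```python
import json

# The two feature gates named in the comment above.
GATES = ("UserNamespacesSupport", "UserNamespacesPodSecurityStandards")


def feature_gates_flag(gates, enabled=True):
    """Format gates as the comma-separated value of --feature-gates."""
    value = "true" if enabled else "false"
    return ",".join(f"{gate}={value}" for gate in gates)


def talos_patch(gates):
    """Build a config patch as a dict.

    Assumption: the kubelet and the API server both take the gates via
    their extraArgs sections. Verify against the Talos config reference.
    """
    flag = feature_gates_flag(gates)
    return {
        "machine": {"kubelet": {"extraArgs": {"feature-gates": flag}}},
        "cluster": {"apiServer": {"extraArgs": {"feature-gates": flag}}},
    }


if __name__ == "__main__":
    print(json.dumps(talos_patch(GATES), indent=2))
```

Generating the value from a single list keeps the kubelet and API server settings in sync, which matters because a gate enabled on only one component can behave differently from what the docs describe.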

I love the idea of including a test case. The KEP states that if the runtime doesn't support user namespaces, a deployment with hostUsers: false will fail to be created. A sample Pod definition is provided for testing. In my testing against Kubernetes v1.30.1 on Talos 1.7.3, that Pod definition deploys without issue, even though the Pod runs in the host user namespace (confirmed with cat /proc/self/uid_map). I don't know why this works, though I suspect it's because the feature gate is disabled. If the feature gate were enabled, I would expect the deployment to fail due to the use of containerd v1.7. I'm not sure that this presents a clear test case though, as it sounds like an error would only be produced when the feature gate is enabled AND a containerd older than 2.0 is used. Ideally the test case would fail whenever the host user namespace is used (i.e. including when the feature gate is disabled).
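One way to get a test that fails whenever the host user namespace is in use is to inspect the contents of /proc/self/uid_map directly, as in the sketch below. It relies on the kernel reporting the identity mapping "0 0 4294967295" for the initial user namespace; the function names are hypothetical, and a nested namespace that is itself identity-mapped would look the same, so treat this as a heuristic.

```python
FULL_ID_RANGE = 4294967295  # length of the identity mapping in the initial namespace


def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_id, outside_id, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        entries.append(tuple(int(field) for field in fields))
    return entries


def uses_host_user_namespace(uid_map_text):
    """Return True if the mapping is the full identity map of the host."""
    return parse_id_map(uid_map_text) == [(0, 0, FULL_ID_RANGE)]


def exit_code_for(uid_map_text):
    """Exit code for a test container: 1 when the host namespace is used."""
    return 1 if uses_host_user_namespace(uid_map_text) else 0


if __name__ == "__main__":
    try:
        with open("/proc/self/uid_map") as handle:
            print("host user namespace:", uses_host_user_namespace(handle.read()))
    except OSError as exc:
        print("could not read uid_map:", exc)
```

A test container could pass the result of exit_code_for to sys.exit, so the pod ends in a failed state regardless of whether the feature gate is enabled.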

Piccirello avatar Jun 28 '24 21:06 Piccirello

I think there is a bug in kubelet: it never fails, and the mappings inside the pod are completely wrong. I would have expected kubelet to fail to create the pod or to throw an error, so this is not the expected behavior. Tested with the feature gate enabled.

frezbo avatar Jul 30 '24 18:07 frezbo

More updates on this: the feature seems to give a false sense of security. If the feature gate is not enabled, a pod with hostUsers: false set is happily scheduled and runs on a node that does not meet any of the requirements for user namespaces. This seems weird, and it looks like a security issue.

With the feature gate enabled and hostUsers: false set the pod fails to be scheduled with this error:

failed to mount rootfs component: no space left on device

which is indeed a red herring and might be masking some other issue
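One possibility worth ruling out (an assumption, not something confirmed in this thread): "no space left on device" is the text for ENOSPC, and since Linux 4.9 the kernel also returns ENOSPC when creating a user namespace would exceed the limit in /proc/sys/user/max_user_namespaces. A minimal sketch for checking that limit on a node follows; the function names are hypothetical.

```python
from pathlib import Path

# Per-user-namespace limit on the number of user namespaces (Linux 4.9+).
LIMIT_PATH = Path("/proc/sys/user/max_user_namespaces")


def limit_allows_user_namespaces(limit_text):
    """Return True if the limit permits creating at least one user namespace."""
    return int(limit_text.strip()) > 0


def describe_limit(path=LIMIT_PATH):
    """Read the limit and summarize whether it could explain ENOSPC."""
    try:
        text = path.read_text()
    except OSError as exc:
        return f"could not read {path}: {exc}"
    if limit_allows_user_namespaces(text):
        return f"limit is {text.strip()}: user namespaces can be created"
    return "limit is 0: creating a user namespace would fail with ENOSPC"


if __name__ == "__main__":
    print(describe_limit())
```

If the limit turns out to be non-zero, this explanation is ruled out and the error has some other cause.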

frezbo avatar Jul 31 '24 12:07 frezbo

Created an issue in k8s https://github.com/kubernetes/kubernetes/issues/126484

frezbo avatar Jul 31 '24 15:07 frezbo

@smira I believe Talos should enable the above feature gates by default in v1.8.

With the feature gate enabled and hostUsers: false set the pod fails to be scheduled with this error:

failed to mount rootfs component: no space left on device

My guess is that this error is specific to Talos's read-only environment.

Piccirello avatar Sep 17 '24 00:09 Piccirello

kubelet doesn't provide more info on how to debug this, so we're at a loss

frezbo avatar Sep 17 '24 02:09 frezbo