wireguard-operator
Does not work with baseline pod security standard
Describe the bug
❯ k describe rs
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-vxz94" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-xz8fh" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-85956" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-bh8p7" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-ln28h" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-wjsrs" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-psmgq" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 109s replicaset-controller Error creating: pods "media-dep-878876c8d-ctlb4" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 108s replicaset-controller Error creating: pods "media-dep-878876c8d-qwstr" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
Warning FailedCreate 27s (x6 over 107s) replicaset-controller (combined from similar events): Error creating: pods "media-dep-878876c8d-fvh5h" is forbidden: violates PodSecurity "baseline:latest": non-default capabilities (containers "metrics", "agent" must not include "NET_ADMIN" in securityContext.capabilities.add)
To Reproduce
Run a Kubernetes cluster that enforces the baseline pod security standard (e.g. Talos).
https://kubernetes.io/docs/concepts/security/pod-security-admission/
Expected behavior
Optionally use the userspace wireguard implementation.
Screenshots
N/A
Additional context
Maybe the operator could remove the privileged security context if the user space implementation is being used?
Did you have any success running it on top of Talos?
I've added the pod-security.kubernetes.io/enforce: privileged label to the namespace. Do you think that is safe and sufficient?
I use Talos, and it works but it does need that label. A lot of projects need it unfortunately.
I ended up using this magic incantation to fix wireguard on Talos:
apiVersion: v1
kind: Namespace
metadata:
  name: wireguard
  labels:
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: privileged
    pod-security.kubernetes.io/warn-version: latest
@Twi The only label which should be necessary is pod-security.kubernetes.io/enforce: privileged. The logs may complain without some of those other labels, but it will work.
Can this change be added to the project? I've never used Talos, so I cannot test it :(. I'd really appreciate it if you could add it!
Optionally use the userspace wireguard implementation.
I'm wondering if there is a way to detect that we are running on Talos and cannot run the kernel-mode WireGuard?
It would be a nice feature to have, though it is important to note that this is not specific to Talos: it affects any Kubernetes cluster which enforces the baseline pod security standard. There is already some fallback mechanism in place when creating the tunnel itself, but I believe the operator will also need to change the pods it creates.
If we don't get the user space implementation working, we can at least add a notice about PodSecurity to the README :)
Otherwise it could take other people a while to discover the cause of the issue.
I wonder what the right way to do this is? I guess the first step is to add some configuration option to force user space (and remove NET_ADMIN from the security capabilities). A feature could then be built on top of that which automatically detects the current pod security standard? Not sure what the right default is. User space is likely to be less efficient, but more compatible.
I like the multiple phases approach ^^
- I wonder what the right way to do this is? I guess the first step is to add some configuration option to force user space (and remove NET_ADMIN from the security capabilities).
So there is currently a parameter in the Wireguard resource called useWgUserspaceImplementation:
useWgUserspaceImplementation:
  description: A boolean field that specifies whether to use the userspace
https://github.com/jodevsa/wireguard-operator/blob/main/config/crd/bases/vpn.wireguard-operator.io_wireguards.yaml#L72
This parameter is passed to the agent, which is the bootstrapping software that actually runs WireGuard. What is currently missing is that we need to stop adding the security capabilities when useWgUserspaceImplementation is true.
so around here: https://github.com/jodevsa/wireguard-operator/blob/main/pkg/controllers/wireguard_controller.go#L741
we need something like
if !m.Spec.UseWgUserspaceImplementation {
	// inject the NET_ADMIN security capability
}
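To make the idea above concrete, here is a minimal, self-contained sketch of phase 1. It is not the operator's actual code: `Capabilities`, `SecurityContext` and `WireguardSpec` are simplified stand-ins for the real Kubernetes and operator types, and `agentSecurityContext` is a hypothetical helper name. It also assumes, as this thread does, that the userspace implementation can run without NET_ADMIN, which has not been verified here.

```go
package main

import "fmt"

// Capabilities and SecurityContext are simplified stand-ins for the
// Kubernetes types of the same name, so the sketch has no dependencies.
type Capabilities struct {
	Add []string
}

type SecurityContext struct {
	Capabilities *Capabilities
}

// WireguardSpec mirrors only the one CRD field discussed in this thread.
type WireguardSpec struct {
	UseWgUserspaceImplementation bool
}

// agentSecurityContext is a hypothetical helper. It requests NET_ADMIN
// only when the kernel implementation is going to be used.
func agentSecurityContext(spec WireguardSpec) *SecurityContext {
	if spec.UseWgUserspaceImplementation {
		// Assumption: userspace mode works without extra capabilities.
		return &SecurityContext{}
	}
	return &SecurityContext{
		Capabilities: &Capabilities{Add: []string{"NET_ADMIN"}},
	}
}

func main() {
	for _, userspace := range []bool{false, true} {
		sc := agentSecurityContext(WireguardSpec{UseWgUserspaceImplementation: userspace})
		requested := []string{}
		if sc.Capabilities != nil {
			requested = sc.Capabilities.Add
		}
		fmt.Printf("userspace=%v add=%v\n", userspace, requested)
	}
}
```

The same check would have to be applied to every container that currently requests the capability (the events above name both "metrics" and "agent").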
which automatically detects the current pod security standard
Any ideas on how we can detect that? Is there a Kubernetes ConfigMap that can be read to find out which capabilities are allowed? I think that might be more straightforward than trying to run a pod with that capability and waiting to see if it fails.
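One possible starting point for the detection question: Pod Security Admission reads its per-namespace level from the `pod-security.kubernetes.io/enforce` namespace label, so the operator could inspect the labels of the namespace it deploys into. The sketch below only covers that label check. `netAdminFromLabels` and the verdict type are hypothetical names, and fetching the namespace labels from the API is left out. When the label is absent the cluster-wide default applies, and as far as I know that default lives in the API server's admission configuration rather than in a ConfigMap, so the sketch reports "unknown" in that case. Other admission controllers could still reject the pod.

```go
package main

import "fmt"

// enforceLabel is the namespace label read by Pod Security Admission.
const enforceLabel = "pod-security.kubernetes.io/enforce"

// netAdminVerdict describes what the namespace labels tell us.
type netAdminVerdict int

const (
	netAdminUnknown   netAdminVerdict = iota // no usable label: cluster default applies
	netAdminAllowed                          // "privileged" places no restriction
	netAdminForbidden                        // "baseline" and "restricted" reject NET_ADMIN
)

func (v netAdminVerdict) String() string {
	switch v {
	case netAdminAllowed:
		return "allowed"
	case netAdminForbidden:
		return "forbidden"
	default:
		return "unknown"
	}
}

// netAdminFromLabels is a hypothetical helper that maps the enforce
// label of a namespace to a verdict about the NET_ADMIN capability.
func netAdminFromLabels(labels map[string]string) netAdminVerdict {
	switch labels[enforceLabel] {
	case "privileged":
		return netAdminAllowed
	case "baseline", "restricted":
		return netAdminForbidden
	default:
		return netAdminUnknown
	}
}

func main() {
	examples := []map[string]string{
		{enforceLabel: "privileged"},
		{enforceLabel: "baseline"},
		{}, // namespace without the label
	}
	for _, labels := range examples {
		fmt.Printf("%v -> %v\n", labels, netAdminFromLabels(labels))
	}
}
```

For the "unknown" case the operator would still need some fallback, for example the explicit useWgUserspaceImplementation flag.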
So, going back to what @uhthomas suggested, we have two phases to get this done:
Phase 1: Do not add the NET_ADMIN capability if wireguard.spec.useWgUserspaceImplementation is true.
Phase 2: Detect the pod security standard and fall back to the userspace implementation if we are not allowed to have the NET_ADMIN capability.
example of using the flag:
apiVersion: vpn.wireguard-operator.io/v1alpha1
kind: Wireguard
metadata:
  name: vpn
spec:
  useWgUserspaceImplementation: true