Supply the contour configurations with PodSecurityPolicy and related RBAC configs
What steps did you take and what happened:
I deployed Contour onto a cluster that has PSP enabled, and most of the components did not come up.
$ kubectl apply -f examples/contour
After that, this is the pod status:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
contour-6f4746b855-4cpj9 0/1 ContainerCreating 0 10m
contour-6f4746b855-cvn9k 0/1 ContainerCreating 0 10m
contour-certgen-kz4nt 0/1 CreateContainerConfigError 0 11m
- The contour-certgen-kz4nt pod is in CreateContainerConfigError because the process inside it is running as root.
- The envoy DaemonSet did not come up because it needs elevated privileges; this is what I see in the events:
4m42s Warning FailedCreate daemonset/envoy
Error creating: pods "envoy-" is forbidden: unable to validate against any pod
security policy: [spec.containers[0].hostPort: Invalid value: 80: Host port 80
is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid
value: 443: Host port 443 is not allowed to be used. Allowed ports: []]
- The contour Deployment has been waiting for volumes that are supposed to be populated by the certgen job:
3m37s Warning FailedMount pod/contour-6f4746b855-cvn9k
Unable to attach or mount volumes: unmounted volumes=[contourcert cacert],
unattached volumes=[contourcert cacert contour-config contour-token-8ntjh]:
timed out waiting for the condition
What did you expect to happen:
I expect the Contour project to ship a PSP (and related RBAC configs) in the examples/contour directory, so that when I run the following command all the components get deployed without issues:
kubectl apply -f examples/contour
Anything else you would like to add:
Events during the deployment:
$ kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
<unknown> Normal Scheduled pod/contour-6f4746b855-4cpj9 Successfully assigned projectcontour/contour-6f4746b855-4cpj9 to suraj-cluster-helium-worker-0
3m59s Warning FailedMount pod/contour-6f4746b855-4cpj9 MountVolume.SetUp failed for volume "cacert" : secret "cacert" not found
117s Warning FailedMount pod/contour-6f4746b855-4cpj9 MountVolume.SetUp failed for volume "contourcert" : secret "contourcert" not found
3m35s Warning FailedMount pod/contour-6f4746b855-4cpj9 Unable to attach or mount volumes: unmounted volumes=[cacert contourcert], unattached volumes=[cacert contour-config contour-token-8ntjh contourcert]: timed out waiting for the condition
5m53s Warning FailedMount pod/contour-6f4746b855-4cpj9 Unable to attach or mount volumes: unmounted volumes=[contourcert cacert], unattached volumes=[contour-token-8ntjh contourcert cacert contour-config]: timed out waiting for the condition
<unknown> Normal Scheduled pod/contour-6f4746b855-cvn9k Successfully assigned projectcontour/contour-6f4746b855-cvn9k to suraj-cluster-helium-worker-2
117s Warning FailedMount pod/contour-6f4746b855-cvn9k MountVolume.SetUp failed for volume "contourcert" : secret "contourcert" not found
3m59s Warning FailedMount pod/contour-6f4746b855-cvn9k MountVolume.SetUp failed for volume "cacert" : secret "cacert" not found
8m8s Warning FailedMount pod/contour-6f4746b855-cvn9k Unable to attach or mount volumes: unmounted volumes=[contourcert cacert], unattached volumes=[contour-config contour-token-8ntjh contourcert cacert]: timed out waiting for the condition
5m52s Warning FailedMount pod/contour-6f4746b855-cvn9k Unable to attach or mount volumes: unmounted volumes=[cacert contourcert], unattached volumes=[cacert contour-config contour-token-8ntjh contourcert]: timed out waiting for the condition
3m37s Warning FailedMount pod/contour-6f4746b855-cvn9k Unable to attach or mount volumes: unmounted volumes=[contourcert cacert], unattached volumes=[contourcert cacert contour-config contour-token-8ntjh]: timed out waiting for the condition
10m Normal SuccessfulCreate replicaset/contour-6f4746b855 Created pod: contour-6f4746b855-cvn9k
10m Normal SuccessfulCreate replicaset/contour-6f4746b855 Created pod: contour-6f4746b855-4cpj9
<unknown> Normal Scheduled pod/contour-certgen-kz4nt Successfully assigned projectcontour/contour-certgen-kz4nt to suraj-cluster-helium-worker-1
12s Normal Pulling pod/contour-certgen-kz4nt Pulling image "docker.io/projectcontour/contour:master"
8m53s Normal Pulled pod/contour-certgen-kz4nt Successfully pulled image "docker.io/projectcontour/contour:master"
8m53s Warning Failed pod/contour-certgen-kz4nt Error: container has runAsNonRoot and image will run as root
10m Normal SuccessfulCreate job/contour-certgen Created pod: contour-certgen-kz4nt
10m Normal ScalingReplicaSet deployment/contour Scaled up replica set contour-6f4746b855 to 2
4m42s Warning FailedCreate daemonset/envoy Error creating: pods "envoy-" is forbidden: unable to validate against any pod security policy: [spec.containers[0].hostPort: Invalid value: 80: Host port 80 is not allowed to be used. Allowed ports: [] spec.containers[0].hostPort: Invalid value: 443: Host port 443 is not allowed to be used. Allowed ports: []]
The contour pods picked up the following PSP:
$ kubectl get pods contour-6f4746b855-4cpj9 -o yaml | grep -i psp
kubernetes.io/psp: restricted
and the same for the job pod:
$ kubectl get pods contour-certgen-kz4nt -o yaml | grep psp
kubernetes.io/psp: restricted
Here is what the restricted PSP looks like:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
    - KILL
    - MKNOD
    - SETUID
    - SETGID
  # Allow core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  readOnlyRootFilesystem: false
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: restricted-psp
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames:
      - restricted
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: restricted-psp-system-authenticated
roleRef:
  kind: ClusterRole
  name: restricted-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: Group
    name: system:authenticated
    apiGroup: rbac.authorization.k8s.io
Environment:
- Contour version:
Deployed Contour at the following version: 44d2aa6b02f91c8468be0369dcfa58c5a7388523
- Kubernetes version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.1", GitCommit:"d647ddbd755faf07169599a625faf302ffc34458", GitTreeState:"clean", BuildDate:"2019-10-02T17:01:15Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes installer & version:
https://github.com/kinvolk/lokomotive-kubernetes/ at version fe3cf9190e054642dff88e1848f4edc64636dfa7
- Cloud provider or hardware configuration:
Packet cloud, with a t1.small for the master node and s1.large for the workers.
- OS:
$ cat /etc/os-release
NAME="Flatcar Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=2079.5.0
VERSION_ID=2079.5.0
BUILD_ID=2019-06-05-1748
PRETTY_NAME="Flatcar Linux by Kinvolk 2079.5.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar-linux.org/"
BUG_REPORT_URL="https://issues.flatcar-linux.org"
FLATCAR_BOARD="amd64-usr"
I'd be willing to take this forward if nobody minds.
/cc @youngnick
Thanks for this @surajssd, this looks interesting, as does your related PR. I will check it out in detail today.
I've checked out the PR, and in an initial pass, it looks okay, with some changes. Like I said in the PR though, I don't want to approve this without giving it a proper run and test myself, and I won't have time in the next week. It's been a little while since I did PSP stuff, and I want to verify I haven't missed anything else.
Sorry about the delay, this is great work and I would like to get it in when we can.
After talking to a few people (including @surajssd) at Kubecon, I don't think that adding a PSP is the right way to go here.
Effectively, what we need to do is:
- Document the security capabilities that the Contour and Envoy pods need to do their jobs.
- Ensure that those capabilities are applied to the pods.
In addition, there is one big problem with the current implementation of PSPs: there is no way to ensure that the PSP you create will be applied. If multiple PSPs match, then the first in alphabetical order will be applied. This is an acknowledged weakness of the current design, and is one of the reasons that PSPs in their current form are likely to be deprecated, and replaced with something else.
To that end, I think using pod.spec.securityContext, and having it declare what the pod requires, would be the right way to go here. That way, the pod's security requirements can fit into one of the common security buckets, or people can easily create their own, whatever security feature they use.
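As a rough illustration of that approach, here is a minimal sketch of a pod-level securityContext for the contour Deployment's pod template; the field values are assumptions for illustration, not the project's confirmed requirements:
# Hypothetical securityContext sketch for the contour Deployment's pod template;
# values are illustrative assumptions, not the project's confirmed requirements.
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true        # matches what a "restricted" policy expects
        runAsUser: 65534          # assumed non-root UID ("nobody")
      containers:
        - name: contour
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ['ALL']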
I was about to open an issue around hardening the default deployment of contour and envoy.
Agreed that contour should not ship a PSP, but instead set the security context to the minimum privileges required.
I experimented with this a bit and ran into an issue with hardening Envoy. I tried setting a non-root user, dropping all capabilities, and adding NET_BIND_SERVICE, with no success. It seems there is currently no way to grant capabilities to non-root users in Kubernetes. See https://github.com/kubernetes/kubernetes/issues/56374
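For reference, this is roughly the kind of container securityContext I was experimenting with (illustrative only; as the linked issue explains, the added capability does not take effect for a non-root user):
# Illustrative only: with a non-root runAsUser the added NET_BIND_SERVICE
# capability is not effective (see kubernetes/kubernetes#56374).
securityContext:
  runAsNonRoot: true
  runAsUser: 101              # assumed non-root UID, purely for illustration
  allowPrivilegeEscalation: false
  capabilities:
    drop: ['ALL']
    add: ['NET_BIND_SERVICE']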
The nginx ingress controller seems to be getting around this by adding the capability during the docker build: https://github.com/kubernetes/ingress-nginx/blob/3ffe85537fac9faf9bcdfc845de0d6979c8a3ff6/rootfs/Dockerfile#L53
I don't think that adding a PSP is the right way to go here
The only real argument I see for this is that multiple matching PSPs have undefined results. Is there something I'm missing in my skimming of the issue?
I think that there are effectively two main environments to consider:
- PSP not enabled (e.g. vanilla GKE), in which case it's harmless.
- PSP enabled (e.g. GKE with this), in which case I have to hand-roll a PSP because one wasn't provided.
I know you generally expect folks to customize the Contour configurations (perhaps heavily), but it seems like the best "example" you could provide would work in both of the above environments out of the box.
The problem is, because of the multiple matching problem, if folks already have PSPs enabled, and we add more, the install may or may not work, based on what they've done with their PSPs.
I can see an argument for having an example PSP optionally available (maybe in a separate examples/ directory), but I don't want to put it in the default install.
Additionally, PSPs are being deprecated in favour of OPA/Gatekeeper integrations, as far as I heard from the sig-auth update in San Diego.
@youngnick Hmm, they are still listed as Beta in 1.18 🤔
If you have any more info on the deprecation, I'd love to know more before I spend a bunch of cycles plumbing them places in Knative 😅
@mattmoor If you have the following privileged PSP installed in your cluster:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: privileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
spec:
  privileged: true
  allowPrivilegeEscalation: true
  allowedCapabilities:
    - '*'
  volumes:
    - '*'
  hostNetwork: true
  hostPorts:
    - min: 0
      max: 65535
  hostIPC: true
  hostPID: true
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
I made the following custom changes to the Contour configs and things worked for me:
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: privileged-psp
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames:
      - privileged
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: envoy
  namespace: projectcontour
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: envoy-privileged-psp
  namespace: projectcontour
roleRef:
  kind: ClusterRole
  name: privileged-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: envoy
    namespace: projectcontour
---
# NOTE: Add the above "envoy" ServiceAccount to the "envoy" DaemonSet in the field `spec.template.spec.serviceAccountName`
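For reference, a minimal sketch of that DaemonSet change, showing only the relevant fields:
# Sketch of the DaemonSet change described in the NOTE above;
# only the relevant fields are shown.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: envoy
  namespace: projectcontour
spec:
  template:
    spec:
      serviceAccountName: envoy   # the ServiceAccount created above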
For the contour operator, any restricted PSP would be OK, because it just talks to the apiserver and does not need any extra privileges on the node.
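As a sketch, assuming the example's contour ServiceAccount and the restricted-psp ClusterRole shown earlier, such a binding could look like this:
# Hypothetical binding of the example "contour" ServiceAccount to the
# restricted PSP via the restricted-psp ClusterRole shown earlier.
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: contour-restricted-psp
  namespace: projectcontour
roleRef:
  kind: ClusterRole
  name: restricted-psp
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: contour
    namespace: projectcontour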
Oh, I got PSPs working with Contour; I just felt like they might be useful as part of the "example" configuration that's shipped.
Why can't we have both? I think we should add an example (along with docs) showing how to implement PSPs with Contour. We can keep the quick start clean since its goal is to get up and running quickly in a simple environment, but have other examples of how to implement PSPs if a user requires that type of deployment.
We can keep the quick start clean since its goal is to get up and running quickly
So from what I've seen PSPs are benign until you enable them, so I'm not sure why they'd compete with the "quickstart" goal, but I may be missing something.
Ahh ok I see your point @mattmoor. I was just going off of what @youngnick suggested. I don't have a ton of experience with PSPs.
I was pointed at this: https://www.youtube.com/watch?v=SFtHRmPuhEw&feature=youtu.be&t=920
So, it turns out the recommendation from sig-auth here is to supply PSPs for now, and @mattmoor is right, they should be benign in clusters that don't have PSP enabled.
So, we will need to add something like this for Envoy and possibly the certgen job.
The Envoy work is then blocked on #2374 (adding a ServiceAccount to the Envoy example deployment). Once that's done, we can reference that ServiceAccount in the example and bind the new PSP to it.
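For discussion, a rough sketch of what an Envoy-specific PSP might need to allow, based on the FailedCreate events above (the name and exact values are assumptions):
# Hypothetical PSP for the envoy DaemonSet; name and values are assumptions.
# It allows the hostPorts 80 and 443 that the restricted PSP rejected above.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: envoy
spec:
  privileged: false
  allowPrivilegeEscalation: false
  hostPorts:
    - min: 80
      max: 80
    - min: 443
      max: 443
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  volumes:
    - 'configMap'
    - 'secret'
    - 'emptyDir'
    - 'projected'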
As an update to what @youngnick mentioned a few years back (https://github.com/projectcontour/contour/issues/1896#issuecomment-558387020): PodSecurityPolicies have been deprecated and are scheduled to be removed in Kubernetes v1.25.
The replacement, Pod Security Admission (currently beta), works by applying labels on the namespace to restrict the maximum level of privileges that pods in that namespace can get. These levels are pre-defined by the Pod Security Standards. In theory, we could apply these labels in the example manifest for the projectcontour namespace, but on the other hand I think enforcing a policy is the cluster administrator's responsibility, while the application (developers) is responsible for documenting and using the minimum required privileges. It makes little sense for the application to enforce a policy on itself; that would be a "fox guarding the hen house" situation :).
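For illustration, applying one of those pre-defined levels to the projectcontour namespace would look roughly like this (the levels chosen here are examples, not a recommendation):
# Example only: Pod Security Admission labels on the projectcontour namespace;
# the chosen levels are illustrative, not a recommendation.
apiVersion: v1
kind: Namespace
metadata:
  name: projectcontour
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted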
Agreed and thanks @tsaarni. I think that we should ensure that our example deployments meet the Baseline security standard at a minimum, and ideally they should work in a Restricted namespace as well. I don't know if that's achievable with Envoy though.
The Contour project currently lacks enough contributors to adequately respond to all Issues.
This bot triages Issues according to the following rules:
- After 60d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, the Issue is closed
You can:
- Mark this Issue as fresh by commenting
- Close this Issue
- Offer to help out with triage
Please send feedback to the #contour channel in the Kubernetes Slack