runtime
Feature request: Simplify pod security hardening
For historical reasons, and because of an unwillingness to make the necessary breaking change, Docker and Kubernetes still default to the root user, not to mention other security hardening that most modern applications would work with just fine.
In my experience, most applications work just fine with a security context like this:
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  runAsGroup: 65534
  runAsNonRoot: true
  runAsUser: 65534
  seccompProfile:
    type: RuntimeDefault
With it, they can run in a namespace where the restricted Pod Security Standard is enforced.
That is why it would be nice to have a single setting that can be enabled for an Acorn app, which would add that security context to the deployment and the pod-security.kubernetes.io/enforce: restricted label to the namespace it creates.
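For illustration, the namespace side of this request could look roughly like the manifest below. This is only a sketch: the namespace name is a hypothetical placeholder, and nothing here reflects what Acorn actually generates.

```yaml
# Sketch of a namespace that enforces the restricted profile.
# "example-app-ns" is a placeholder name, not something Acorn produces.
apiVersion: v1
kind: Namespace
metadata:
  name: example-app-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
```

With this label in place, pods that do not carry a compliant security context are rejected at admission time.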
In addition, I have noticed that many apps also work with readOnlyRootFilesystem: true as long as a tmpfs is mounted to /tmp, meaning something like this in the deployment YAML:
volumeMounts:
  - name: tmpfs
    mountPath: /tmp
volumes:
  - name: tmpfs
    emptyDir:
      sizeLimit: 100Mi
Because this is recommended in the NSA Kubernetes Hardening Guide, it would be nice to have a setting for it too.
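Put together, the two fragments from this request might sit in a Deployment roughly as follows. This is a minimal sketch, not Acorn output: the names, labels, and image are hypothetical placeholders, and splitting the fields between the pod-level and container-level securityContext is just one valid arrangement.

```yaml
# Sketch only: "example-app" and the image reference are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      # Pod-level settings apply to every container in the pod.
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        runAsGroup: 65534
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: web
          image: registry.example.com/example-app:1.0
          # These fields exist only at container level.
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: tmpfs
              mountPath: /tmp
      volumes:
        - name: tmpfs
          emptyDir:
            sizeLimit: 100Mi
```

Note that an emptyDir is backed by node storage unless medium: Memory is set, so the volume above is a writable scratch directory rather than a RAM-backed tmpfs in the strict sense.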
As it stands right now, all apps run in namespaces that have the baseline profile set (pod-security.kubernetes.io/enforce=baseline). If you wish to switch to the restricted profile, run acorn install --pod-security-enforce-profile=restricted. Right now the profile is a global setting. We are open to making that more configurable, but it requires careful design to ensure it's done securely per namespace, app, etc. So for now we leave it as a global, administrator-level setting.
Going beyond the standard profiles is a bit difficult. We don't want to limit the number of users who can use this, so enforcing anything stricter than the baseline profile by default would have a negative effect. What I do think is possible is some framework that lets one configure, basically, a pod template that will be applied.
If you wish to switch to the restricted profile do acorn install --pod-security-enforce-profile=restricted. Right now the profile is a global setting.
That is already a nice feature. However, with that profile Acorn should switch to non-root and drop ALL capabilities, as those are the areas where the Kubernetes defaults do not comply with the restricted policy.
I mean, for example, docker run -it -u nobody:nogroup --cap-drop=ALL mcr.microsoft.com/dotnet/samples:aspnetapp runs just fine, but
containers: {
  web: {
    image: "mcr.microsoft.com/dotnet/samples:aspnetapp"
    ports: publish: "80/http"
  }
}
does not, because that image defaults to root and even Microsoft is not willing to change that default (yes, I tried in https://github.com/dotnet/dotnet-docker/pull/3139).
@olljanat Docker allows non-root to bind to <1024 ports. I'm not sure k8s does that by default, I have to look into it.
k8s does not, but k3s and rke2 do starting with v1.24.2, as I managed to get it included: https://github.com/k3s-io/k3s/pull/5538
However, ASP.NET can also be instructed to change its port with an environment variable such as ASPNETCORE_URLS=http://+:8080
If my proposal gets approved then containerd 2.0 should enable it for k8s too https://github.com/containerd/containerd/issues/6924
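As a rough sketch of the environment-variable workaround mentioned above, the container section of a plain Kubernetes pod spec could look like this. It is a fragment, not a complete manifest, and it assumes the app honors ASPNETCORE_URLS as described; the containerPort value is simply chosen to match it.

```yaml
# Fragment of a pod spec: moves the app off port 80 so that a
# non-root user without extra capabilities can bind to it.
containers:
  - name: web
    image: mcr.microsoft.com/dotnet/samples:aspnetapp
    env:
      - name: ASPNETCORE_URLS
        value: "http://+:8080"
    ports:
      - containerPort: 8080
```

Any Service or published port in front of the container would then need to target 8080 instead of 80.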
need to discuss with @ibuildthecloud to see what we want to do here
I just thought I'd add, I can't get anything to actually run with the restricted profile. The Deployments generated by acorn do not meet the requirements so you get this kind of thing in the replicaset events and no Pods:
Warning FailedCreate 40s replicaset-controller Error creating: pods "web-5759d85644-6pnr2" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "web" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "web" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "web" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "web" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Warning FailedCreate 1s (x5 over 39s) replicaset-controller (combined from similar events): Error creating: pods "web-5759d85644-mqjwg" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "web" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "web" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "web" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "web" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
I'm not sure if there is a trick I'm missing here to change some config in my Acornfile to make it comply...