kubernetes-ingress
Apply UID/GID defaults from image
Proposed changes
`build/Dockerfile` specifies `USER 101` for the `common` target, which is re-applied into the final images. Helm Chart/Manifests do not need to specify the UID explicitly, and can instead use the image's UID. (PodSecurityContext v1 core specifies `runAsUser` defaults to the user specified in image metadata if unspecified.)
The existing `runAsNonRoot: true` flag (already in place) ensures at runtime that the image is configured with a non-root user ID.
This is notably helpful for users running OpenShift, because OpenShift enforces custom UID/GID ranges for individual namespaces as part of the `restricted-v2` Security Context Constraint. With the hard-coded values removed from the manifests, OpenShift is able to assign its own UID/GID.
In practice, this means a different model of configuring file system permissions. OpenShift assigns the container process GID 0 as a supplemental group to assist with that. Locations that are expected to be written to must be owned by GID 0, with group write permissions. Previous changes to `main` have ensured that this is the case.
The init container that copies files is not a concern, as it runs with the same owner UID as the main NIC container.
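For illustration, a minimal sketch of the resulting controller securityContext under this model (the real templates live in `deployments/helm-chart/templates/`; the commented-out line is the field being dropped):

```yaml
securityContext:
  runAsNonRoot: true        # kept: the runtime verifies the effective user is non-root
  # runAsUser: 101          # removed: the UID comes from the image's USER 101, or from OpenShift's namespace range
  capabilities:
    add:
      - NET_BIND_SERVICE    # unchanged: lets the unprivileged user bind ports 80/443
```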
Reference: A Guide to OpenShift and UIDs
Closes #5422.
Checklist
Before creating a PR, run through this checklist and mark each as complete.
- [x] I have read the CONTRIBUTING doc
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have checked that all unit tests pass after adding my changes
- [ ] I have updated necessary documentation
- [x] I have rebased my branch onto main
- [x] I will ensure my PR is targeting the main branch and pulling from my branch from my own fork
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 52.35%. Comparing base (`8a2c9a9`) to head (`f38706e`). Report is 257 commits behind head on main.
:exclamation: Current head f38706e differs from pull request most recent head ee75176. Consider uploading reports for the commit ee75176 to get more accurate results.
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #3665      +/-   ##
==========================================
+ Coverage   52.32%   52.35%   +0.03%
==========================================
  Files          61       59       -2
  Lines       17502    16880     -622
==========================================
- Hits         9158     8838     -320
+ Misses       8015     7747     -268
+ Partials      329      295      -34
```
Not sure if the CI / Smoke Tests (alpine, policies, 1.26.2) failure (an internally observed 502 Bad Gateway response) is relevant to this PR. 🤔
~Will double-check if I can repro.~ Re-running the tests helped. I do not see how this PR could cause a transient test failure.
Hey hey. Sorry, I have been busy with a few initiatives that have tight delivery deadlines.
I did not test this out on OpenShift. I opened the PR based on community member feedback so it doesn't get lost, and checked myself that the plain Kubernetes side is functional.
OpenShift's blog/guidance covers the ownership/access concern:
> The Container user is always a member of the root group, so it can read or write files accessible by GID=0. Any command invoked by the Entrypoint will be executed with this unprivileged UID and GID pair. That means, it is an unprivileged user executing the commands and the UID that will be used during execution is not known in advance. From the technical design perspective, that means, directories and files that may be written to by processes in the Container should be owned by the root group and be read/writable by GID=0. Files to be executed should also have group execute permissions.
Will mark this PR as a draft - need to double-check on the concern raised.
Sorry for the lack of reply on this. The open initiatives are getting wrapped up so I have some time to look into this again.
Did you test this in OpenShift? If it runs as a random UID and we set all the permissions for the files to the nginx user, will it cause problems?
I personally operate Kubernetes clusters instead of OpenShift clusters. I don't feel like reading the licensing model by RedHat or registering for a 2-month developer trial, so I will spin up a basic Pay-as-You-Go OpenShift cluster in Azure instead. Microsoft explicitly requires Dsv3 (3rd generation D-series) VMs for OpenShift master nodes, so I just requested a quota change on my Azure account. Should be able to deploy soon and report how the file operations behave.
Shouldn't this be enforced in the SCC? Can't we just set 101 there? See https://github.com/nginxinc/nginx-ingress-helm-operator/blob/main/resources/scc.yaml#L7-L9
The linked issue requests the `restricted-v2` SCC to be supported out of the box. In particular, the issue is that the SCC "Requires that a pod is run as a user in a pre-allocated range of UIDs". (Reference: RedHat: Managing security context constraints)
I think that means `runAsUser.type: MustRunAsNonRoot` is preferable, so that no default UID is specified in the SCC.
Will double-check this during testing - a good time for me to learn a bit more about the OpenShift policies here.
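For reference, the strategy I have in mind would look something like this in the SCC (field names per the OpenShift SecurityContextConstraints API; an illustrative excerpt, not the `scc.yaml` shipped in the operator repository):

```yaml
runAsUser:
  type: MustRunAsNonRoot   # no fixed UID pinned, only "must not run as root"
```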
Fun fact. When deploying an OpenShift cluster in Azure, there are actually three kinds of VMs that get deployed for you:
- One bootstrap machine, which is removed after installation
- Three control plane machines
- Three compute machines
The three control plane machines can be D8s_v3, resulting in 24 cores used. But it appears the bootstrap machine also uses the D8s_v3 size, so the actual minimum quota requirement there is 32 cores (excluding worker/compute nodes). I am waiting for a further quota increase.
Initial testing looks strange against OpenShift 4.10, which is what Azure deploys by default.
I am not seeing issues inside the Pods, which is what I would expect if a UID that cannot access the files were used. Instead, what I see is that the Pods themselves do not get created in the ReplicaSet. I should be able to dedicate more focus time to this tomorrow, or will just sit down with it over the weekend. 🤔
So, a quick sanity check that it works on local K8s without fancy add-ons:
```shell
# install microk8s locally
sudo snap install microk8s --classic
microk8s status --wait-ready

# or reset an existing microk8s installation
sudo microk8s reset

# enable basic services on microk8s
microk8s enable dashboard dns registry istio

microk8s helm install nginx-ingress deployments/helm-chart \
  --namespace nginx-ingress --create-namespace \
  --set controller.image.repository=ghcr.io/nginxinc/kubernetes-ingress \
  --set controller.image.tag=edge \
  --set controller.service.type=NodePort

microk8s kubectl -n nginx-ingress get pods
# NAME                                        READY   STATUS    RESTARTS   AGE
# nginx-ingress-controller-6bd8954f55-rxsc9   1/1     Running   0          60s

microk8s kubectl -n nginx-ingress exec -it nginx-ingress-controller-6bd8954f55-rxsc9 -- /usr/bin/id -u
# 101
```
Currently dropping a few findings and notes. OpenShift has interesting security features built in. To summarize, it wants to run things in a more restricted manner than "native Kubernetes" does.
To be honest, I am a little annoyed with RedHat and/or Microsoft, as I was working based on expectations from the latest documentation. When deploying installer-provisioned infrastructure in Microsoft Azure, you get OpenShift v4.10 (current is v4.12). Why this matters is that v4.11 introduces the `restricted-v2` security context constraint. When running v4.10, you only have the older `restricted` security context constraint, which does not seem to allow the `NET_BIND_SERVICE` capability to be added explicitly -- that's the CAPS column below, listing `<no value>` instead of `["NET_BIND_SERVICE"]`.
```shell
oc get scc privileged restricted restricted-v2
# NAME         PRIV    CAPS         SELINUX     RUNASUSER        FSGROUP     SUPGROUP   PRIORITY     READONLYROOTFS   VOLUMES
# privileged   true    ["*"]        RunAsAny    RunAsAny         RunAsAny    RunAsAny   <no value>   false            ["*"]
# restricted   false   <no value>   MustRunAs   MustRunAsRange   MustRunAs   RunAsAny   <no value>   false            ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"]
# Error from server (NotFound): securitycontextconstraints.security.openshift.io "restricted-v2" not found
```
The error event observed for the ReplicaSet (not able to deploy the Pod) was initially not clear to me when trying to deploy the Helm chart. But really, the "providers" are just the security context constraints that it tries to evaluate.
```
pods "nginx-ingress-controller-deadbeef00-" is forbidden: unable to validate against any security context constraint: [
  provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.containers[0].securityContext.capabilities.add: Invalid value: "NET_BIND_SERVICE": capability may not be added,
  provider "nonroot": Forbidden: not usable by user or serviceaccount,
  provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount,
  provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount,
  provider "hostnetwork": Forbidden: not usable by user or serviceaccount,
  provider "hostaccess": Forbidden: not usable by user or serviceaccount,
  provider "node-exporter": Forbidden: not usable by user or serviceaccount,
  provider "privileged": Forbidden: not usable by user or serviceaccount,
  provider "privileged-genevalogging": Forbidden: not usable by user or serviceaccount]
```
And `Forbidden: not usable by user or serviceaccount` can be resolved by adding a RoleBinding for the security context constraint to the `nginx-ingress` ServiceAccount. I am not sure why the `restricted` SCC is not being applied automatically, or why there is no failure information for the `restricted` SCC visible to me in the logs.
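As a side note, the `oc adm policy add-scc-to-user` commands below are the shortcut. To my understanding, on recent OpenShift releases this amounts to a RoleBinding against the corresponding `system:openshift:scc:<name>` cluster role, roughly like this (a sketch; the role-name convention and the binding name are assumptions on my part):

```yaml
# Rough equivalent of: oc adm policy add-scc-to-user privileged -z nginx-ingress -n nginx-ingress
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nginx-ingress-scc-privileged      # illustrative name
  namespace: nginx-ingress
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged   # assumed naming convention for the SCC cluster roles
subjects:
  - kind: ServiceAccount
    name: nginx-ingress
    namespace: nginx-ingress
```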
```shell
oc new-project nginx-ingress --display-name='Nginx Ingress Controller'
oc -n nginx-ingress adm policy add-scc-to-user restricted -z nginx-ingress
oc -n nginx-ingress adm policy add-scc-to-user privileged -z nginx-ingress

helm install nginx-ingress deployments/helm-chart \
  --namespace nginx-ingress \
  --set controller.image.repository=ghcr.io/nginxinc/kubernetes-ingress \
  --set controller.image.tag=edge

kubectl -n nginx-ingress exec -it nginx-ingress-controller-689d96577b-cj2kv -- /usr/bin/id -u
# 101
```
The `privileged` SCC is as open as it gets, and that seems to work.
I want to poke around a bit more -- I need to manually configure a `restricted` SCC variant that allows `NET_BIND_SERVICE`, to see how the additional security safeguards change the UID behavior. I should probably try an upgrade to v4.11 as well, as I would like to see the built-in `restricted-v2` in action.
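The custom SCC I have in mind would mirror the built-in `restricted` SCC from the `oc get scc` output above, with `NET_BIND_SERVICE` allowed. An untested sketch (the name and the `requiredDropCapabilities` list are my assumptions; the strategy types follow the columns above):

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: restricted-net-bind        # illustrative name
allowPrivilegedContainer: false
allowedCapabilities:
  - NET_BIND_SERVICE               # the one addition over the built-in "restricted" SCC
requiredDropCapabilities:          # assumed to match the built-in drop list
  - KILL
  - MKNOD
  - SETUID
  - SETGID
runAsUser:
  type: MustRunAsRange             # RUNASUSER column above
seLinuxContext:
  type: MustRunAs                  # SELINUX column above
fsGroup:
  type: MustRunAs                  # FSGROUP column above
supplementalGroups:
  type: RunAsAny                   # SUPGROUP column above
readOnlyRootFilesystem: false
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - persistentVolumeClaim
  - projected
  - secret
```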
~I need to check on the difference that `edge-ubi` brings with it, as the Building the Ingress Controller Image page says it is for OpenShift clusters. That's `nginxcontrib/nginx:1.23.4-ubi` as the base image, instead of the default `nginx:1.23.4`.~
Looking at Characteristics of UBI images, UBI (Universal Base Images) is a standardized base image offering by RedHat. You end up with the base layers cached in a predictable manner across various applications, and they ship their pre-approved runtimes plus other minor tweaks.
Will re-test later with the UBI images, but I expect them to functionally behave the same.
Red Hat OpenShift Service on AWS (ROSA) needs some manual configuration to authorize the RedHat Cloud Console to access your AWS account, but after the linking is done, the deployment process from the RedHat Console is smooth. It allows picking a specific version from a dropdown, so the deployed cluster is already upgraded for you, instead of having to rely on the pre-selected version (as in Microsoft Azure) and then triggering an upgrade. Small things, but they can matter when you are messing around with cluster deployments.
About OpenShift versioning: v4.10 (released March 2022) is in Maintenance Support until September 2023. This is the version that does not have the `restricted-v2` security context constraint available. (Docs: Red Hat OpenShift Container Platform Life Cycle Policy)
I was not able to configure an SCC based on `restricted` that gets picked up in the `nginx-ingress` namespace. Something is causing the built-in `restricted` SCC to be ignored, so if I base my new template on it, that gets ignored as well. (Or maybe I am missing something else.)
In any case, running NIC with the `privileged` SCC seems to be functional, just as in plain K8s.
```shell
oc new-project nginx-ingress --display-name='Nginx Ingress Controller'
oc adm policy add-scc-to-user privileged -z nginx-ingress -n nginx-ingress

helm install nginx-ingress deployments/helm-chart \
  --namespace nginx-ingress \
  --set controller.image.repository=ghcr.io/nginxinc/kubernetes-ingress \
  --set controller.image.tag=edge-ubi

kubectl -n nginx-ingress get pods
# NAME                                        READY   STATUS    RESTARTS   AGE
# nginx-ingress-controller-6988bc87d9-fckm5   1/1     Running   0          30s

kubectl -n nginx-ingress exec -it nginx-ingress-controller-6988bc87d9-fckm5 -- /bin/id -u
# 101
```
I then deleted the `nginx-ingress` namespace and asked the RedHat Console to upgrade my cluster to the 4.11 channel.
When starting on 4.11+, you get `runAsUser: 1001040000` and accordingly in the Pod:
```
Failed to write main config: failed to open /etc/nginx/nginx.conf: open /etc/nginx/nginx.conf: permission denied
```
If the Helm chart is applied with the `controller.readOnlyRootFilesystem=true` value, then pod admission does not like `fsGroup` being set to `101` explicitly.
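In other words, it is this part of the rendered pod spec that admission rejects under `restricted-v2` (illustrative excerpt; as far as I can tell, the rejection happens because 101 falls outside the namespace's pre-allocated GID range):

```yaml
securityContext:   # pod-level
  fsGroup: 101     # rejected: not within the namespace's pre-allocated range
```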
So I am currently thinking about how I can patch up the project for GID=0 as well as make `fsGroup` flexible. Will follow up with further changes.
Removing `fsGroup` for `readOnlyRootFilesystem` works because the `emptyDir` volumes get created with mode `777`.
```shell
oc exec -n nginx-ingress -it nginx-ingress-controller-778c5c8fbf-tvxlm -c nginx-ingress -- /bin/bash
bash-5.1$ id -u
# 1001090000
bash-5.1$ id -g
# 0
bash-5.1$ ls -al /var/cache/nginx/
# total 0
# drwxrwsrwx. 7 root       1001090000 98 May 16 20:52 .
# drwxr-xr-x. 1 root       root       19 Apr 12 03:52 ..
# drwx--S---. 2 1001090000 1001090000  6 May 16 20:52 client_temp
# drwx--S---. 2 1001090000 1001090000  6 May 16 20:52 fastcgi_temp
# drwx--S---. 2 1001090000 1001090000  6 May 16 20:52 proxy_temp
# drwx--S---. 2 1001090000 1001090000  6 May 16 20:52 scgi_temp
# drwx--S---. 2 1001090000 1001090000  6 May 16 20:52 uwsgi_temp
```
Will need to be re-tested on plain Kubernetes.
OpenShift SCCs are actually fairly straightforward. They are a programmatic way to enforce a policy, and they hook into the pod admission process. Nothing crazy. Except that sometimes the event messages do not include the reason why the `restricted` constraint cannot be applied. Go figure.
Overall, this ended up being a bit larger of a change than I initially thought. Here is the run-down:
- The change removes `runAsUser` and `fsGroup`, so the UID and GID assignments are no longer enforced by the Helm chart templates directly.
- UID selection on Kubernetes is inherited from the image metadata (so it retains UID 101 as now), while the GID logic can be left to the defaults (we don't depend on it).
- For OpenShift 4.11+, this means the default, locked-down `restricted-v2` security context constraint is functional. Pods can get a randomized UID/GID.
- Even though the UID/GID are randomized, a supplemental GID 0 is applied, so the NIC process can work with files and directories whose group ownership is set to root.
- The Dockerfile already configures ownership to GID 0, so we adjust permissions to give the group the same rights as the owner user (see the sketch below).
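A sketch of what that permission adjustment amounts to in Dockerfile terms (illustrative only; the exact paths and lines live in `build/Dockerfile` and the split-off PRs mentioned below):

```dockerfile
# Keep UID 101 as the owner, hand group ownership to GID 0, and give the group
# the same rights as the owner, so an OpenShift-assigned UID (member of GID 0) can write.
RUN chown -R 101:0 /etc/nginx /var/cache/nginx /var/log/nginx \
 && chmod -R g=u /etc/nginx /var/cache/nginx /var/log/nginx
```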
I know the `/var/log/nginx` directory was added as a potential write location when implementing the read-only root filesystem. The Dockerfile now also changes ownership of this directory, in the same manner as for `/etc/nginx` and the other `/var` locations.
The directory was previously mode `755`, owned by `root:root`. Two files existed in it -- `access.log` and `error.log` -- with mode `640`, owned by `nginx:adm`. The ownership is now changed to `nginx:root`, with mode `775` for the directory and `660` for the files.
Built a UBI image with `make ubi-image TARGET=container` and validated that it deploys to OpenShift 4.11 with:
```shell
helm install nginx-ingress deployments/helm-chart \
  --namespace nginx-ingress --create-namespace \
  --set controller.image.repository=docker.io/sigv/nginx-ingress \
  --set controller.image.tag=cbe5594-ubi
```
I did a quick sanity check that the controller responds with an HTTP 404.
I then enabled the read-only root filesystem with:
```shell
helm upgrade nginx-ingress deployments/helm-chart -n nginx-ingress \
  --set controller.image.repository=docker.io/sigv/nginx-ingress \
  --set controller.image.tag=cbe5594-ubi \
  --set controller.readOnlyRootFilesystem=true
```
..and observed the same HTTP 404.
On local machine,
```shell
microk8s helm install nginx-ingress deployments/helm-chart \
  --namespace nginx-ingress --create-namespace \
  --set controller.image.repository=docker.io/sigv/nginx-ingress \
  --set controller.image.tag=cbe5594-debian \
  --set controller.service.type=NodePort
```
..starts up successfully.
I redeployed OpenShift version 4.10. The `privileged` SCC appears to function as before:
```shell
oc new-project nginx-ingress --display-name='Nginx Ingress Controller'
oc adm policy add-scc-to-user privileged -z nginx-ingress -n nginx-ingress

helm install nginx-ingress deployments/helm-chart \
  --namespace nginx-ingress \
  --set controller.image.repository=docker.io/sigv/nginx-ingress \
  --set controller.image.tag=cbe5594-ubi
```
~Will run the OSS test suite myself as well, to verify there is no strangeness going on.~ The OSS test suite passes locally.
OBS: If the workflow gets run on GitHub Actions, it needs to run as protected inside the official repository -- there is a Dockerfile change, which means the cache cannot be reused, and the Plus secrets are needed for the new build.
Executive summary: This change allows deployment to OpenShift (v4.11+) without having to manually configure custom security constraints.
@lucacome, should I maybe split this apart into smaller PRs based on the technical changes to be reviewed independently? Or is this okay as-is, one functional change for OpenShift users?
I split off #3925 and #3926 to deal with clean-up changes that were initially included in this change but can be reviewed individually. Hopefully this makes the review easier on those two tasks.
I will rebase this PR when they are approved, so that the change scope here is halved, and update the description accordingly. 🤞🏻
@lucacome, should I split the Dockerfile changes into separate PRs - one for `chown 101:0 /var/log/nginx` and one for `chmod g=u`?
That way, this PR would only consist of the `runAsUser` removal. That is the tricky part that can break some custom images, where people add `USER 0` in the Dockerfile but do not add `USER 101` later.
Thinking more about this, we probably don't want to touch `/var/log/nginx`, as `access.log` is symlinked to `/dev/stdout` and `error.log` is symlinked to `/dev/stderr`. But in that case, we should probably drop `/var/log/nginx` from the `emptyDir` configurations for the read-only rootfs.
This stays as WIP. ⏳
I split off the `chmod g=u` change for review as #3962 in the meantime.
This was a reply to a comment that has since been removed.
The proposed diff is visible in this Pull Request's Files changed tab.
If you are using the Helm chart, the relevant references are in `deployments/helm-chart/templates/` -- depending on whether a DaemonSet or a Deployment is used, it's either `controller-daemonset.yaml` or `controller-deployment.yaml`, respectively.
If you are instead using the static manifests, the relevant references are in `deployments/daemon-set` or `deployments/deployment`, depending on the installation type, as above. Either the `nginx-ingress.yaml` or the `nginx-plus-ingress.yaml` file will be relevant, depending on whether you run NGINX OSS or NGINX Plus.
However, do keep in mind that there is a proposed change to `build/Dockerfile` pending as well, so you would have to manually build an image and load it into your OpenShift cluster. The latest release build does not include these prerequisite changes.
A notable part of the Dockerfile change is pending review as #3962. There is still a concern that if you are currently writing anything to `/var/log/nginx`, then this scope would not be enough. However, for a vanilla configuration, once that scope is merged, the latest edge build should satisfy your requirements.
@andrescolodrero, if you still experience issues, let me know what OpenShift version you are running and what exact commands you are using to carry out the installation.
@sigv I'm using OpenShift 4.12.
Installation is done via the Operator, basic setup from the UI. Operator version: nginx-ingress-operator.v1.5.0
Yes. The Operator is not yet scoped in for removal of anyuid. This is a slow work-in-progress.
Ok, thanks @sigv. I also tried the Helm install with the same results; I will check building my own image of the Ingress Controller.
> Yes. The Operator is not yet scoped in for removal of anyuid. This is a slow work-in-progress.
Considering that our Operator is derived from the Helm chart... We need to make sure it works. We were talking about this on our end this morning.
With #4269 merged, I will dust this off (merge conflicts) and look into relevant Operator changes.
There are currently other initiatives that I am working on, but later this month I should be able to work on this change too.
Not seeing any issues with the Operator from my end. Opened nginxinc/nginx-ingress-helm-operator#254 to modify the SCC included in the repository there.
Let me know what testing would be preferable from my side.
Hi @sigv, can you create a new issue for this, as the one linked in the description seems incomplete and WIP? We can then prioritise and review the PR, thanks!
@vepatel, opened a fresh #5422 for this scope. 🤞🏻
This PR is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 10 days.
Hi @sigv, now that securityContext is configurable via Helm, is this still required?