Changes needed in the gloo-ee / gloo helm charts for 1.25 compatibility with a namespace using restricted Pod Security Standards (PSS)
Gloo Edge Product
Open Source
Gloo Edge Version
latest
Kubernetes Version
1.25
Describe the bug
Summary:
Issues when deploying Gloo Edge on Kubernetes 1.25 with a restricted Pod Security Standard (PSS) profile:
1. gloo-ee/charts/gloo/templates/19-gloo-mtls-certgen-job.yaml: the certgen container does not allow setting a complete podSecurityContext (PSC).
2. gloo-ee/charts/gloo/templates/3-discovery-deployment.yaml: the containers have a hardcoded PSC that is missing seccompProfile and cannot be overridden.
3. gloo-ee/charts/gloo/templates/5-resource-cleanup-job.yaml: the kubectl container has a hardcoded PSC without seccompProfile or dropped capabilities.
4. gloo-ee/charts/gloo/templates/5-resource-migration-job.yaml: same as number 3.
5. gloo-ee/charts/gloo/templates/5-resource-rollout-job.yaml: same as 3 and 4.
6. gloo-ee/charts/gloo/templates/6.5-gateway-certgen-job.yaml: same as 3 and 4.
7. gloo-ee/templates/70-resource-rollout-job.yaml: same as 3 / 4.
8. gloo-ee/templates/_helpers.tpl: the gloo.extauthinitcontainers template does not allow setting a PSC.
9. Several helm-hooks do not set resource request/limits.
IMHO, many of these changes affect "single-shot" pods, where adding a default PSC that matches a restricted namespace should be enough; the only exception is the gloo.extauthinitcontainers template helper.
Expected Behavior
Gloo Edge OSS and Gloo Edge Enterprise should be able to be deployed in Kubernetes 1.25 with the standards set forth by the restricted PSS profile
Steps to reproduce the bug
Deploy the latest Gloo Edge on Kubernetes 1.25 in a cluster set up with the restricted PSS profile.
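For anyone reproducing this, a namespace that enforces the restricted profile can be declared roughly as below. This is a minimal sketch: the namespace name gloo-system is an assumption, while the labels are the standard Pod Security Admission labels.

```yaml
# Sketch: namespace enforcing the restricted Pod Security Standard.
# The namespace name is an assumption; adjust to your install namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: gloo-system
  labels:
    # Reject any pod that violates the restricted profile.
    pod-security.kubernetes.io/enforce: restricted
    # Pin the policy to the rules as defined for Kubernetes 1.25.
    pod-security.kubernetes.io/enforce-version: v1.25
```

Installing the chart into a namespace labeled this way should surface the violations listed in the summary as rejected pods.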
Additional Environment Detail
No response
Additional Context
Additional Context: link to PSS doc
Related Issues
- [x] https://github.com/solo-io/gloo/issues/8455
Note: gloo-ee/templates/70-resource-rollout-job.yaml was removed in https://github.com/solo-io/solo-projects/pull/5491/files
@ably77 - question on "9 - Several helm-hooks do not set resource request/limits":
I don't see anything about resource requests/limits in the Pod Security Standards. Is this specifically needed for meeting PSS/deploying with a restricted profile, or is this more generally part of requested helm updates?
The OSS changes are now in a PR.
In addition to adding support for configuring the individual container securityContexts, I have added a flag, global.podSecurityStandards.container.enableRestrictedContainerDefaults, that will default all container securityContexts to the following securityContext, which applies the minimal changes needed to meet the Restricted Pod Security Standards:
securityContext:
  allowPrivilegeEscalation: false
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
  capabilities:
    drop:
    - ALL
Template-specific defaults will be applied to this context.
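As a rough illustration, the flag named above would be enabled through a Helm values fragment like the following. The key path is taken directly from the flag name in this comment; the placement in a user-supplied values file is an assumption.

```yaml
# Sketch of a values override enabling the restricted container defaults.
# Key path derived from the flag name quoted in this thread.
global:
  podSecurityStandards:
    container:
      enableRestrictedContainerDefaults: true
```

Passing this file to the chart with helm's -f option should apply the default securityContext to every container that does not define its own.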
Hey @sheidkamp, sorry I missed this. I don't think it's a hard requirement that is strictly enforced, but most organizations treat configurable resources as a recommended best practice, so this is more the "generally part of requested helm updates" case.
Generally, I think we'll see a tool like OPA, Kyverno, or an admission controller that will block a Pod without defined resources from being deployed.
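To make the failure mode concrete without assuming a particular OPA or Kyverno policy, here is a sketch using the built-in ResourceQuota admission check, which has a similar effect. Once a quota on compute resources exists in a namespace, pods that omit the corresponding requests or limits are rejected unless a LimitRange supplies defaults. The object name, namespace, and quantities are all illustrative assumptions.

```yaml
# Sketch: a compute ResourceQuota; pods in this namespace must then
# declare cpu/memory requests and limits or they are rejected at admission.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: require-compute-resources   # hypothetical name
  namespace: gloo-system            # assumed namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```

A helm-hook job without a resources block would fail to schedule in such a namespace, which is why the setting needs to be configurable per hook.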
@sheidkamp : great that this got fixed! Is this also covering extauth (this is not visible in the PR)? See https://github.com/solo-io/gloo/issues/8455#issuecomment-1631888657
@ably77 - extauth will be covered in the EE PR that relies on the OSS PR.
For resource limits, is that needed at the container level, i.e. basically the same scope as the security contexts?
Resource limits also seem dangerous to enforce, given that most of these commands are highly dependent on a customer's environment. @ably77, can you move that part to a separate RFE? It's not cut and dried, and it is potentially a dangerous update.
I don't think we need to strictly set requests/limits by default, but we should allow them to be configured by a user who wants to.
We will consider this. Although everything can technically already be overridden with kustomize, we can check whether there is a cleaner update.
@ably77 - looking for some additional clarifications. I see we set the resources in the 5-/6.5-/19- jobs (for example with gateway.cleanupJob.resources).
Can you give examples (or a full list) of the hooks that need this configuration?
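For reference, the existing setting mentioned above would be used roughly like this. The gateway.cleanupJob.resources key comes from this comment; the quantities are placeholder assumptions and should be sized for the actual environment.

```yaml
# Sketch: resource settings for the cleanup job via the existing values key.
# Quantities are illustrative only.
gateway:
  cleanupJob:
    resources:
      requests:
        cpu: 100m
        memory: 64Mi
      limits:
        memory: 128Mi
```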
The container security changes have been merged into EE/solo-projects main (will be part of the 1.17.0-beta3 release) and the 1.16.x branch (will be part of the 1.16.10 release).
As requested in https://github.com/solo-io/gloo/issues/8864#issuecomment-2117726015, please open another RFE for the resource limits, ideally with clarifications requested in https://github.com/solo-io/gloo/issues/8864#issuecomment-2118323654