AWX fails to deploy on OpenShift when using project persistence
ISSUE TYPE
- Bug Report
SUMMARY
When spec.projects_persistence: true is set, the operator fails to deploy AWX on OpenShift.
ENVIRONMENT
- AWX version: 19.4.0
- Operator version: 0.14.0
- Kubernetes version: 1.21
- AWX install method: OKD 4.8
STEPS TO REPRODUCE
- Install AWX Operator on OpenShift
- Apply configuration using
spec.projects_persistence: true
EXPECTED RESULTS
AWX deploys and is available.
ACTUAL RESULTS
AWX fails to deploy. The ReplicaSet gives the following error message: "Error creating: pods "awx-65c446586f-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1000}: 1000 is not an allowed group, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]"
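A note on reading this message: the providers marked "not usable by user or serviceaccount" were not available to the pod's service account at all, so the only SCC that actually judged the pod spec is restricted, and its objection is the fsGroup value. Below is a rough sketch that filters the message down to that line. The function name scc_real_rejections is made up for illustration, and the sketch assumes the event text is fed in on stdin as a single line with no commas inside an individual provider entry.

```shell
#!/bin/sh
# Keep only the SCC providers that really evaluated the pod, i.e. drop
# every provider the service account is not permitted to use.
# Assumption: one event message on stdin, providers separated by commas.
scc_real_rejections() {
  tr ',' '\n' |                                       # one provider entry per line
    grep 'provider' |                                 # ignore anything that is not a provider entry
    grep -v 'not usable by user or serviceaccount' |  # drop SCCs the account cannot use
    sed 's/^ *//'                                     # trim the leading space left by the split
}
```

For example, with the event text saved to a file (error.txt is a placeholder name), `scc_real_rejections < error.txt` should leave just the restricted entry complaining about fsGroup 1000.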
ADDITIONAL INFORMATION
awx-config.yaml (with minor redactions):
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  service_type: nodeport
  ingress_type: route
  hostname: awx.apps.dev.example.com
  postgres_image: bitnami/postgresql
  redis_image: bitnami/redis
  postgres_storage_class: ovirt-csi-sc
  ldap_cacert_secret: idm-ca
  bundle_cacert_secret: idm-ca
  projects_persistence: true
  projects_storage_class: ovirt-csi-sc-hdd
---
apiVersion: v1
kind: Secret
metadata:
  name: idm-ca
  namespace: awx
data:
  bundle-ca.crt: |
    [redacted]
  ldap-ca.crt: |
    [redacted]
AWX-OPERATOR LOGS
None at the moment, can provide later if required.
We think this is the problem line https://github.com/ansible/awx-operator/blob/d1d6785b7dc704fe6e0093eef680d4a849b20f90/roles/installer/templates/deployment.yaml.j2#L316 but don't know what we need to do to resolve it.
You need to define an SCC for the application.
@mrcetinel As I understand it, OpenShift will create a group and user when necessary. Is there a reason the fsGroup needs to be 1000? There's more than one case I've encountered where simply removing that line fixes the issue. I did try giving both the default and awx users privileged SCC roles but that didn't help either.
As for what can be done to fix it, would adding an "is-openshift" flag to the config be appropriate to disable this? Or possibly a securityGroup override?
@gjsmo One of the easiest ways is to grant the anyuid SCC to the default service account. The privileged SCC will not help you. Also, it is not possible to define a custom service account, which is why we need to grant the anyuid SCC to the default service account.
I think that will solve your problem.
But if you set persistence to true and use NFS, you will hit issue #532. I opened that issue, but no solution has been provided so far, so we are not using persistent volumes anymore.
On the other hand, a persistent volume is not a requirement for AWX. Its only advantage is that it speeds up pulling the repositories from the Git server.
$ oc adm policy add-scc-to-user anyuid -z default
- Please keep in mind that this is not advised by Red Hat, but I have not seen any side effects. You can restrict access to the project/namespace to experienced colleagues.
@mrcetinel That doesn't seem to work unfortunately, maybe I'm missing something. I add the SCC, update the AWX resource, and the operator creates the PVC but still cannot create the pod. The error is the same.
According to the docs on SCCs, privileged is the most relaxed SCC and should allow running with any UID/GID. I'm still curious whether the fsGroup is required at all: OpenShift should provision a UID/GID for the application with no specification needed.
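On the fsGroup question: as far as I understand the restricted SCC, its fsGroup strategy is MustRunAs with no explicit range, so the allowed group comes from the block pre-allocated to the namespace in the openshift.io/sa.scc.supplemental-groups annotation. A hard-coded 1000 falls outside such a block, which would match the "1000 is not an allowed group" error. Below is a simplified sketch of that check, not the actual admission code. The function name fsgroup_allowed and the sample block 1000650000/10000 are illustrative assumptions, and only a single block in `<start>/<length>` form is handled.

```shell
#!/bin/sh
# Simplified model of the fsGroup range check under the restricted SCC.
# $1: pre-allocated group block in "<start>/<length>" form (assumed format)
# $2: fsGroup requested by the pod spec
fsgroup_allowed() {
  start=${1%%/*}     # text before the slash: first allowed GID
  length=${1##*/}    # text after the slash: number of GIDs in the block
  gid=$2
  if [ "$gid" -ge "$start" ] && [ "$gid" -lt $((start + length)) ]; then
    echo allowed
  else
    echo denied
  fi
}

# Hypothetical namespace block; the fsGroup 1000 from the error is outside it.
fsgroup_allowed 1000650000/10000 1000
```

The real block for a namespace appears among the annotations printed by `oc get namespace awx -o yaml`. If the pod spec left fsGroup unset, the SCC would default it from that block, which is presumably why removing the line helps.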
@gjsmo Could you please try the procedure below? It works like a charm on OKD.
$ oc new-project awx
$ oc apply -f https://raw.githubusercontent.com/ansible/awx-operator/0.12.0/deploy/awx-operator.yaml
$ oc create sa awx
$ oc adm policy add-scc-to-user privileged -z awx
$ oc adm policy add-cluster-role-to-user cluster-admin -z awx-operator
FYI, most people are unable to add SCCs to their clusters due to security restrictions. Can we instead change to use a uid > 1000?
@rooftopcellist It doesn't look like the commit that referenced this issue ever got merged into the main fork. When you have a chance, it'd be good to finish this up.
@rooftopcellist Was the fix merged? I am facing the same issue.
I am installing version 0.15.0 of the AWX operator and had the same issue. As a workaround, I did the following:
$ oc get sa
NAME                              SECRETS   AGE
awx                               2         13m
awx-operator-controller-manager   2         4d14h
builder                           2         6d4h
default                           2         6d4h
deployer                          2         6d4h
# We are going to adjust the `awx` service account
$ oc adm policy add-scc-to-user privileged --serviceaccount=awx
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "awx"
FYI, most people are unable to add SCCs to their clusters due to security restrictions. Can we instead change to use a uid > 1000?
I am not sure that is possible. To be honest, I tried it about 3-4 months ago and it did not work; I need to test it again.
@zentavr Thank you for your feedback
@mrcetinel Sorry for reviving a long-dormant issue :) Do you still use the same methods despite the updates since 2022?