awx-operator
AWXBackup: mkdir: cannot create directory '/backups/tower-openshift-backup-2024-04-17-141257': Permission denied
Please confirm the following
- [X] I agree to follow this project's code of conduct.
- [X] I have checked the current issues for duplicates.
- [X] I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.
Bug Summary
Upgraded the AWX operator from 1.1.4 to 2.13.1 and started to get issues when trying to take backups. Here's an example of the AWXBackup I have:
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awx-demo
  namespace: awx-test
spec:
  deployment_name: awx-demo
  backup_pvc: 'backup-pvc'
  no_log: false
Once applied, the operator tries to create a folder for the backup on the db-management pod. However, it fails with a permission denied error:
[backup : Set backup directory name] **************************************\r\ntask path: /opt/ansible/roles/backup/tasks/postgres.yml:55\nok: [localhost] => {"ansible_facts": {"backup_dir": "/backups/tower-openshift-backup-2024-04-17-141257"}, "changed": false}\n\r\nTASK [backup : Create directory for backup] ************************************\r\ntask path: /opt/ansible/roles/backup/tasks/postgres.yml:59\nansible.cfg.\nfatal: [localhost]: FAILED! => {"changed": true, "rc": 1, "return_code": 1, "stderr": "mkdir: cannot create directory '/backups/tower-openshift-backup-2024-04-17-141257': Permission denied\n", "stderr_lines": ["mkdir: cannot create directory '/backups/tower-openshift-backup-2024-04-17-141257': Permission denied"]
AWX Operator version
2.13.1
AWX version
24.0.0
Kubernetes platform
kubernetes
Kubernetes/Platform version
microk8s v1.28.8
Modifications
no
Steps to reproduce
Fresh installation and trying to create a backup using AWXBackup CR.
Expected results
Take the backup successfully
Actual results
Failed backup
Additional information
No response
Operator Logs
No response
Hello @PWeverton, can you read through this issue and see if it applies to your case https://github.com/ansible/awx-operator/issues/1775? The new postgres image is expecting to write to your dir as uid-26. There are some workarounds discussed to address the change.
Hello @jessicamack, thanks for replying. Well, the operator is having issues when trying to create the dir on db-management pod, so I don't think the issue you marked is related to it. However, I just reproduced the action items suggested there. Here's what I did:
- Used operator 2.15.0, as that feature was added in this tag.
- Then adjusted my AWX CR:
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-app
spec:
  no_log: false
  service_type: nodeport
  postgres_data_volume_init: true
  postgres_init_container_commands: |
    chown 26:0 /var/lib/pgsql/data
    chmod 700 /var/lib/pgsql/data
Even after this change, the permissions issue is still there.
This issue is not addressed by #1805 (postgres_data_volume_init and postgres_init_container_commands), since it arises in a different situation:
- It occurs in the ephemeral *-db-management pod instead of the main PSQL pod
- It occurs in the backup PVC instead of the main PSQL PVC
- No init container for the *-db-management pod is implemented in the current AWX Operator
- From 2.13.0, the image for the *-db-management pod has also been changed to sclorg's one
So we should implement an init container for the *-db-management pod and have a flag to modify owners/perms for the backup PVC, or have a flag to run the *-db-management pod as UID:0.
@rooftopcellist F.Y.I.
hi @kurokobo, any movement here? Thanks
I just made PR #1854. I'm able to take successful backups now if I run that init container once per PVC.
Please add this change to the next release, since AWXBackup also cannot create the directory in my deployment because of permission issues. Changing the permissions on the NFS server to user ID 26 solved it, but this is a manual configuration step used as a workaround.
In case it helps someone: I worked around this problem by creating a CronJob which creates my backup, and added an init container which sets the ownership of the backup folder to 26:26.
I hit this issue after upgrading to 2.15.0. As per @pombaer's first suggestion, I added another NFS mount and set the owner UID and GID to 26, then created a new PV/PVC and pushed the backup to that. For anyone using AWS EFS, you need to create an access point with the correct UID and GID and mount with that for it to work properly.
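The one-off ownership fix described in the comments above can be sketched as a Kubernetes Job that mounts the backup PVC and hands it to UID 26. This is a minimal sketch, not something taken from the operator: the Job name and container name are hypothetical, the namespace and claim name are borrowed from the AWXBackup example at the top of the thread, and it assumes the storage backend lets a root container change ownership (an NFS export with root_squash enabled would reject the chown, in which case the change has to be made on the server side as described above).

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: fix-backup-pvc-perms      # hypothetical name
  namespace: awx-test             # namespace of the AWXBackup CR
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: chown-backups     # hypothetical name
          image: busybox
          # The sclorg postgres image used by the db-management pod runs as UID 26
          command: ["sh", "-c", "chown 26:0 /backups && chmod 770 /backups"]
          securityContext:
            runAsUser: 0          # root is needed to change ownership
          volumeMounts:
            - name: backup
              mountPath: /backups
      volumes:
        - name: backup
          persistentVolumeClaim:
            claimName: backup-pvc # the PVC referenced by spec.backup_pvc
```

It only needs to run once per PVC, before the first AWXBackup is applied.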
Did this issue get resolved? I have the same issue running 2.19.1
Interestingly, if I change the Postgres image in my awxbackup.yml to
_postgres_image: docker.io/postgres
_postgres_image_version: 15-alpine
the "mkdir: cannot create directory '/backups': Permission denied" error goes away and I can take a successful backup. However, this just shifts my issue to a restore problem:
"pg_restore: error: unsupported version (1.15) in file header"
So I went back to the default image (lines commented out again below) and modified the permissions to 26 on /backups, and it works. But my hack was dirty, so I'm wondering what the correct way to do this will be.
# _postgres_image: docker.io/postgres
# _postgres_image_version: 15-alpine
Running k3s.
Client Version: v1.29.4+k3s1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4+k3s1
The PR is still open and it seems it will take a while to be merged. As a workaround, you can clone the repo and modify the way this is handled. You can use one of these 2 options, both applied to this file:
roles/backup/templates/management-pod.yml.j2
1 - Add an init container
initContainers:
  - name: init-pvc-chown
    image: busybox
    # _postgres_image runs as uid 26
    command: ["sh", "-c", "chown -R :26 /backups && chmod -R 770 /backups"]
    volumeMounts:
      - name: {{ ansible_operator_meta.name }}-backup
        mountPath: /backups
        readOnly: false
2 - Run the container as a privileged user
containers:
  - name: {{ ansible_operator_meta.name }}-db-management
    image: "{{ _postgres_image }}"
    imagePullPolicy: "{{ image_pull_policy }}"
    command: ["sleep", "infinity"]
    securityContext:
      runAsUser: 0
      privileged: true
Once you have it in place, just build the image and set the image URL on your operator deployment.
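If the operator is installed with kustomize, pointing the deployment at the rebuilt image might look like the sketch below. It assumes the standard kustomize-based install; the registry path, the image tag, and the ref are placeholders to replace with your own values, and the namespace is borrowed from the AWXBackup example at the top of the thread.

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Upstream manifests; pick the ref that matches the tag you cloned and patched
  - github.com/ansible/awx-operator/config/default?ref=2.19.1
images:
  # Swap the upstream operator image for the locally built one
  - name: quay.io/ansible/awx-operator
    newName: registry.example.com/me/awx-operator   # hypothetical registry path
    newTag: 2.19.1-backup-fix                       # hypothetical tag
namespace: awx-test
```

Applying it with `kubectl apply -k .` should roll the operator deployment over to the patched image.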