awx-operator
Assign ownership of backup pvc to postgres image's uid
SUMMARY
Backups are failing because the postgres image runs as uid 26, which doesn't have permissions on the backup PVC. This fixes #1830.
ISSUE TYPE
- Bug, Docs Fix or other nominal change
ADDITIONAL INFORMATION
The other option is to have the db-management pod run as root (uid 0), but I don't think that's the preferred solution.
@ranvit Hi, thanks for working on this!
Note that AFAIK changing the owner and permissions of the PVC is NOT required for all users, only for specific setups, e.g. hostPath or Longhorn on k8s. So adding an init container with root privileges by default may cause side effects for those on k8s or OpenShift clusters where the issue does not exist.
There is already a similar implementation that changes the owner and permissions of the PVC for PSQL, so I think it would be better to follow the existing implementation instead of going a different way. What do you think? The existing implementation has a flag that enables an init container to invoke specific commands. Refer to: https://github.com/ansible/awx-operator/pull/1805
As @kurokobo mentioned, we should make this init container optional, following the same pattern we did in #1805
Also, we should use the postgres image instead of busybox, try to use the same variable names as the other postgres init implementation, etc.
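For illustration, here is a minimal sketch of what such an init container could look like in plain Kubernetes terms. It is not the operator's actual template: the pod, container, volume, and claim names are hypothetical, the image reference is an assumption, and in the operator the `initContainers` entry would presumably be rendered only when an opt-in flag is set, as in #1805. The `/backups` mount path and uid 26 come from this thread.

```yaml
# Hypothetical sketch only; names and image are assumptions, not the operator's template.
apiVersion: v1
kind: Pod
metadata:
  name: example-db-management            # hypothetical name
spec:
  initContainers:
    # Runs once as root before the main container and hands the backup
    # volume over to the uid the postgres image runs as (26).
    - name: init-backup-volume           # hypothetical name
      image: quay.io/sclorg/postgresql-15-c9s:latest   # assumed postgres image
      securityContext:
        runAsUser: 0
      command:
        - /bin/sh
        - -c
        - chown -R 26:0 /backups
      volumeMounts:
        - name: backup-volume
          mountPath: /backups
  containers:
    - name: db-management
      image: quay.io/sclorg/postgresql-15-c9s:latest   # assumed postgres image
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: backup-volume
          mountPath: /backups
  volumes:
    - name: backup-volume
      persistentVolumeClaim:
        claimName: example-backup-claim  # hypothetical claim name
```

Keeping the root-level step in a separate, optional init container means the main db-management container can keep running as uid 26, and clusters where the PVC is already writable are not affected.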
Jumping on this because I'm running a migration test using AWXBackup and AWXRestore for the first time, migrating from pg 13 to 15, from Operator 2.11.0 to 2.18.0, and from AWX 23.7.0 to 24.5.0.
The migration itself is not working (too big a leap?), so once the migration from 13 to 15 is done I end up with an empty AWX deploy. Hence the effort on backup/restore through the operator, but maybe this is the exact problem that is causing the migration to fail:
TASK [Get AWX object definition from pvc] ********************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to copy file from Pod: cat: /backups/tower-openshift-backup-2024-06-24-204806/awx_object: Permission denied\n"}
https://github.com/ansible/awx-operator/blob/devel/roles/restore/tasks/import_vars.yml#L11
I'm running the deploy from another AWX instance, so it's not easy to just add the workaround mentioned above. At least, I'm not familiar with how to change the required permissions.
Hello @ranvit, are you still working on this PR?