awx-operator
Default AWX Configuration causes postgres invalid password
ISSUE TYPE
- Bug Report
SUMMARY
I am trying to deploy the AWX Operator and one AWX instance, but it keeps having problems with the postgres credentials even though this is a clean, default install as given here.
I applied the awx-operator manifest with 0.13.0 as <TAG> and also used the manifest below to deploy the AWX and PostgreSQL instance:
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-demo
spec:
  service_type: clusterip
  ingress_type: none
  hostname: awx-demo.example.com
ENVIRONMENT
- AWX version: 0.13.0
- Operator version: 0.13.0
- Kubernetes version: v1.20.8-gke.2100
- AWX install method: Kubernetes
STEPS TO REPRODUCE
- Have a kubernetes env running (I was using GKE)
- kubectl apply -f https://raw.githubusercontent.com/ansible/awx-operator/0.13.0/deploy/awx-operator.yaml
- Apply the AWX instance manifest I referred above
- Log on the PostgreSQL pod and there will be a message saying
2021-09-08 07:38:57.290 UTC [197] FATAL: password authentication failed for user "awx"
2021-09-08 07:38:57.290 UTC [197] DETAIL: Connection matched pg_hba.conf line 99: "host all all all scram-sha-256"
EXPECTED RESULTS
Access AWX normally through the service (I used kubectl port-forward)
ACTUAL RESULTS
but instead I got this
ADDITIONAL INFORMATION
AWX-OPERATOR LOGS
Nothing seems to be related to my problem in the logs
Saw exactly the same on k3d v4.4.8 and Docker 20.10.2.
Is it possible that you could share the password that the operator is generating here? After you change it, of course. Our thought is that maybe a special character is causing problems.
Had the same issue (#631) in GKE. I tried using a password without any special chars but got the same error in the logs. Not sure if it is only related to GKE.
Seeing the same issue on Operator v0.17.0, AWX 20.0.0 on k3s.
Using kubectl delete awx $NAME doesn't remove the PersistentVolumeClaim for the AWX database. As helpfully pointed out in #631, deleting the PVC and applying the AWX manifest again resolved the issue.
Though, of course, this only works if you are happy to lose the contents of the database.
Met the same issue on Operator v0.17.1 on OpenShift. Is there any workaround to fix the issue? I don't want to lose the current database.
I'm seeing this same issue on Azure Kubernetes Service too with operator 0.16.x just as a "me too".
Same with operator 0.20.0 and 0.21.0. garbage_collect_secrets does not work.
The solution for me was to explicitly set the awx-postgres credentials as if it were an external db (see the README at https://github.com/ansible/awx-operator/blob/devel/README.md#external-postgresql-service but change type from unmanaged to managed). It seems that the awx containers don't set up the credentials correctly in some scenarios.
You can get the details for the complete postgres_configuration_secret data from the awx containers at /etc/tower/conf.d/credentials.py
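For anyone trying this, here is a rough sketch of what that could look like. The key names follow the external PostgreSQL example in the README; the Secret name awx-demo-postgres-configuration and every <placeholder> value are assumptions for illustration and need to be replaced with the values from your own deployment (for example the ones found in credentials.py). The only deliberate difference from the README example is type: managed.

```yaml
---
# Sketch only: the Secret name and all <placeholder> values are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: awx-demo-postgres-configuration   # hypothetical name
  namespace: <target namespace>
stringData:
  host: <postgres service name>
  port: "5432"                 # PostgreSQL default port, quoted because stringData values are strings
  database: <database name>
  username: awx                # the user from the FATAL log line above
  password: <password>
  sslmode: prefer
  type: managed                # the README example for an external db uses "unmanaged"
type: Opaque
---
# The AWX resource from the original report, pointing at the Secret above.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-demo
spec:
  service_type: clusterip
  ingress_type: none
  hostname: awx-demo.example.com
  postgres_configuration_secret: awx-demo-postgres-configuration
```

Note that the password in the Secret has to match what the database already has stored for the awx user, otherwise the same authentication error will show up in the postgres log.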
Deploying Ansible Automation Platform 2.3 on RHEL8. Hit this issue with the following error:
django.db.utils.OperationalError: FATAL: password authentication failed for user "awx"
Our password had special characters in it, namely the character '&'. We changed the password to one without that character and the install ran through flawlessly.
We'll report it to Red Hat.
Hi, would you mind sharing your script for the postgresql configuration? We hit the same issue and deleting the database is not an option! Also, is there a typo in "note type put tu managed"?
Sorry, I have corrected my comment. I don't remember the exact steps I followed, but in essence you configure the internal db like an external db: https://github.com/ansible/awx-operator/tree/0.21.0#database-configuration
If it still doesn't work and we keep getting the same error in the awx-postgres-0 log, we will connect to the database:
# kubectl exec -it awx-postgres-0 -- psql -U awx
We change the password to the one we have specified in the previous secrets:
# ALTER USER awx WITH PASSWORD 'yournewpass';
At this point you should be able to connect the awx pod.
This kinda helped and kinda didn't. I'm on a newer operator version, 2.6.0, and I had a failed version upgrade which for some reason left me with the database connection not working like this. I'm using a rancher kubernetes cluster, so my setup may be different. Commenting here with my experience to possibly help others.
I followed your instructions here to change the database connection password, which worked in that it changed the password on the db, which I didn't know, to a new password. I was not able to get awx to connect using the new password though. I was however able to find that the database config (unmanaged config with all db info: password, host, user, and port) was automatically stored in a secret in rancher. I was not able to find that info inside the awx container file structure where you indicated in your earlier comment.
Once I found the login details in the stored secret, I was able to restore my awx to working order by updating my awx helm values.yml: removing the default postgres values for unmanaged instances and adding the postgres secret to the spec portion of the values.yml.
---
spec:
  ...
  postgres_configuration_secret: <name-of-your-secret>
Then simply delete (uninstall) awx and reinstall using the new helm values. NOTICE: I did have to go back into the awx postgres database and change the awx user password back to the value that was set in the postgres secret before it actually worked. Changing the password does not seem to work, so do not change it.