awx-operator icon indicating copy to clipboard operation
awx-operator copied to clipboard

Unset k8s secret ownerRefs earlier in reconciliation loop

Open rooftopcellist opened this issue 2 years ago • 1 comments

  • Remove ownerRefs earlier in the reconciliation loop so that they always get removed, even if an error short-circuits the reconciliation loop before the cleanup playbook is run.
SUMMARY

Currently, if the AWX CR is deleted before the reconciliation loop completes, or if it errors before the cleanup task, it is possible to get in a state where the generated k8s secrets are deleted.

This may sound innocuous at a glance, but when you consider that the postgres PVC is left behind and has already created the awx postgres user with the original pg_password valued, you can see how a re-deploy witht he same deployment_name would result in an AWX instance that cannot connect to the database.

This should make it much harder to get in to that state.

ISSUE TYPE
  • Bug, Docs Fix or other nominal change
ADDITIONAL INFORMATION

Work-around

Note: If you are already in this state, it is possible to get the pg_password value from the generated postgres_configuration_secret, exec into the postgresql pod, and change the awx user's password.

rooftopcellist avatar Apr 11 '23 21:04 rooftopcellist

This currently fails because of this error, which makes no sense as this would have been needed for the existing tasks in cleanup.yml..

 TASK [Remove ownerReferences] ******************************** 
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to retrieve requested object: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"secrets \\\\\"{\\'changed\\': False, \\'resources\\': [{\\'apiVersion\\': \\'v1\\', \\'data\\': {\\'password\\': \\'M3R1M3FiM1JxSzRXS3dzalRMSDVEWjFybnVFVDBjMmw=\\'}, \\'kind\\': \\'Secret\\', \\'metadata\\': {\\'annotations\\': {\\'kubectl.kubernetes.io\\\\\" is forbidden: User \\\\\"system:serviceaccount:awx:awx-operator-controller-manager\\\\\" cannot get resource \\\\\"secrets/last-applied-configuration\\': \\'{\\\\\\\\\\\\\"apiVersion\\\\\\\\\\\\\":\\\\\\\\\\\\\"v1\\\\\\\\\\\\\",\\\\\\\\\\\\\"kind\\\\\\\\\\\\\":\\\\\\\\\\\\\"Secret\\\\\\\\\\\\\",\\\\\\\\\\\\\"metadata\\\\\\\\\\\\\":{\\\\\\\\\\\\\"labels\\\\\\\\\\\\\":{\\\\\\\\\\\\\"app.kubernetes.io\\\\\" in API group \\\\\"\\\\\" in the namespace \\\\\"awx\\\\\"\",\"reason\":\"Forbidden\",\"details\":{\"name\":\"{\\'changed\\': False, \\'resources\\': [{\\'apiVersion\\': \\'v1\\', \\'data\\': {\\'password\\': \\'M3R1M3FiM1JxSzRXS3dzalRMSDVEWjFybnVFVDBjMmw=\\'}, \\'kind\\': \\'Secret\\', \\'metadata\\': {\\'annotations\\': {\\'kubectl.kubernetes.io\",\"kind\":\"secrets\"},\"code\":403}\\n'", "reason": "Forbidden"}

Adding that rule to the awx-operator-controller-manager serviceAccount did not even solve the issue...

rooftopcellist avatar Apr 11 '23 22:04 rooftopcellist