awx-operator
AWX migration to external postgres fails
Please confirm the following
- [X] I agree to follow this project's code of conduct.
- [X] I have checked the current issues for duplicates.
- [X] I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.
Bug Summary
When attempting to migrate from AWX on a VM to AWX in k8s with an externally managed postgres cluster, it fails.
AWX Operator version
awx-operator:2.1.0
AWX version
latest
Kubernetes platform
kubernetes
Kubernetes/Platform version
v1.24.8 +rke2r1
Modifications
no
Steps to reproduce
Deploy
```yaml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  secret_key_secret: awx-secret-key
  old_postgres_configuration_secret: awx-old-postgres-configuration
  postgres_configuration_secret: awx-postgres-configuration
  projects_persistence: true
  projects_storage_class: linstor-replica-three
  projects_storage_size: 10Gi
  projects_storage_access_mode: ReadWriteOnce
  admin_password_secret: awx-admin-password
  admin_user: admin
  ingress_annotations: |
    "kubernetes.io/ingress.class": "traefik"
    "traefik.ingress.kubernetes.io/router.entrypoints": "websecure"
    "traefik.ingress.kubernetes.io/router.tls": "true"
  ingress_path: /
  ingress_path_type: Prefix
  ingress_tls_secret: example-com-cf
  hostname: awx.example.com
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-postgres-configuration
  namespace: awx
stringData:
  host: acid-awx-pooler.awx-postgres
  port: "5432"
  database: awx
  username: awx
  password: password
  type: unmanaged
  sslmode: require
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-old-postgres-configuration
  namespace: awx
stringData:
  host: "ip"
  port: "5432"
  database: awx
  username: awx
  password: password
type: Opaque
```
Expected results
I expect it to use the already deployed postgres cluster
Actual results
Fails to work. See log file
Which leads me to: https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/migrate_data.yml#L31
It seems like when you migrate, it is expecting to manage postgres?
Additional information
https://github.com/ansible/awx-operator/issues/1240
Creating new issue as requested
Operator Logs
awx-operator-controller-manager-58b5b7698b-4c2x6_awx-manager.log
I'm encountering the same issue. Trying to migrate from an old AWX 17 VM/psql to an Operator-controlled deployment with external psql (because: no databases on k8s...).
- awx-operator: 2.3.0
- old psql: postgres:12
- new psql: 14.8 (Ubuntu package)
AWX:
```yaml
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  service_type: ClusterIP
  ingress_type: ingress
  ingress_class_name: nginx
  ingress_annotations: |
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "awx-kube-sticky"
    nginx.ingress.kubernetes.io/session-cookie-expires: "172800"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
  hostname: awx1.[example.com]
  secret_key_secret: awx-secret-key
  admin_user: admin
  admin_email: administrator@[example.com]
  admin_password_secret: awx-admin-password
  postgres_configuration_secret: awx-postgres-configuration
  ipv6_disabled: true
  image_pull_policy: Always
```
awx-old-postgres-configuration:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: awx-old-postgres-configuration
  namespace: awx
stringData:
  host: "awx.[example.com]"
  port: "5432"
  database: "db"
  username: "user"
  password: "pw"
type: Opaque
```
awx-postgres-configuration:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: awx-postgres-configuration
  namespace: awx
stringData:
  host: postgresql.[example.com]
  port: "5432"
  database: db2
  username: user2
  password: pw2
  sslmode: prefer
  type: unmanaged
type: Opaque
```
operator log: op-log.txt
As a workaround, does awx/the operator care if I do the migration to a managed DB via a copy of my secret with a different name and set to managed, then move that db to my intended db server and change the secret back to my current one, hopefully causing a deployment update?
@rcarrillocruz I think you got this report from another source, are you tracking this one?
I have manually set up a new AWX. After finally getting AWX up and running with external postgres, I attempted another migration and it failed. So I just manually set up templates, creds, and inventory+project.
@jonathon2nd I have an AWX at 9.3.0 which I managed to migrate to AWX Operator, now running version 23.3.0. I did have to jump through different versions, but I found it easier to have a Bitnami PostgreSQL deployed and run the database as external for the new AWX; this way I could do a SQL import from a dump of the old database.
So, try this:
- Deploy AWX Operator
- Deploy Bitnami PostgreSQL; I have tried 13 to 15 and it works
- Create a Kubernetes Job that pulls a dump from the old database and imports it into the new Bitnami instance. I have a custom image that does this for hourly backups of the database and sends them to an S3 bucket; it also has another command to restore from the latest or a given backup in S3.
- Once the SQL dump is imported, deploy AWX with the external Bitnami database config for the connection.
- Watch the migration to see the progress.
This method has worked for me multiple times. I even tried using Bitnami PostgreSQL HA, but pgpool doesn't play nicely with django cursor.py.
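The dump-and-import step above can be sketched in shell. The pod, service, and credential names below are placeholders for whatever your environment uses, not names the operator creates:

```shell
# Sketch: dump the awx database out of the old managed postgres pod and
# restore it into a new external Bitnami PostgreSQL service.
# All names and hosts are placeholders.
NAMESPACE=awx
OLD_PG_POD=awx-postgres-0                 # old managed postgres pod (placeholder)
NEW_PG_HOST=bitnami-postgresql.awx-db     # new external service (placeholder)

# Dump inside the old pod using the custom archive format.
kubectl -n "$NAMESPACE" exec "$OLD_PG_POD" -- \
  pg_dump -U awx -d awx -F custom -f /tmp/awx.dump

# Copy the dump locally, then restore into the external instance.
kubectl -n "$NAMESPACE" cp "$OLD_PG_POD:/tmp/awx.dump" ./awx.dump
pg_restore -h "$NEW_PG_HOST" -p 5432 -U awx -d awx ./awx.dump
```

A Kubernetes Job wrapping the same two pg commands, as the comment describes, avoids the round-trip through the local machine.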
This looks similar to what was happening here in the pulp/galaxy operator:
- https://github.com/pulp/pulp-operator/pull/1027/files
@Zokormazo is this the issue you were thinking of that @rcarrillocruz was working on?
@jonathon2nd The tricky thing here though is that the migration logic makes the assumption that you are migrating to a managed db in a k8s pod:
- https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/migrate_data.yml#L60
I think that to add support for using this migration logic for external postgres instances, we would need to do something like:
- create a management pod (maybe managed by a k8s job)
- have that management pod connect to the old postgres instance via psql, then redirect the data stream of the pg_dump to a pg_restore in the new postgres instance.
There are some tricky issues to work through though:
- You may need to temporarily write the pg_backup to a tmpfile on the management pod, then cat it when piping to the pg_restore. For large databases, this might use up all of the ephemeral storage on the cluster if a PVC is not used to back this temporary write.
- You would need to connect to the new pg instance as the `postgres` user, or a pg admin user, with the database set to `template1`, I think.
These are just some thoughts and recollections from previous experimenting, not to be taken as fact. But if you want to take a crack at implementing this, a PR would certainly be welcome.
At the moment, if you migrate clusters and use an external postgres, it is on the user to migrate the data from the old postgres to the new one.
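A minimal sketch of that management-pod pipe, run from a pod that can reach both instances (both hostnames and all credentials are placeholder assumptions, and this ignores the temp-file caveat above for very large databases):

```shell
# Sketch: stream a custom-format pg_dump of the old database straight
# into pg_restore pointed at the new external instance. Hosts and
# users are placeholders; authentication (PGPASSWORD or .pgpass) is
# assumed to be set up for both servers.
OLD_HOST=old-postgres.example.com   # placeholder
NEW_HOST=new-postgres.example.com   # placeholder

pg_dump -h "$OLD_HOST" -U awx -d awx -F custom \
  | pg_restore -h "$NEW_HOST" -U postgres -d awx --clean --if-exists
```

The custom archive format can be consumed from stdin for a single-pass restore; a parallel restore (`-j`) would need a seekable file instead of a pipe.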
Is this issue still open?
Just tried to install AWX via the operator with external postgres. It seems like AWX does not create any tables with externally managed postgres. The AWX operator creates the migration jobs, but in my freshly bootstrapped awx database I have no tables, and awx / the operator doesn't create any tables!
You should either create the tables you need if they don't exist, or expand your documentation with a DB bootstrap file ...
Regards, Kilian
I faced the exact same problem yesterday, when just trying to migrate from the built-in managed postgresql database to an external unmanaged database. This is because the code is currently written in such a way that the target for pg_restore is always a managed database inside a k8s container, where both pg_dump and pg_restore are called.
There are currently 3 issues that would need to be fixed in order for this to work:
- always specify the -h parameter (host) for both pg_dump and pg_restore, and do not do implicit trust auth against the target in pg_restore (simple)
- maybe improve handling of credentials, as the current method relies on environment variables in the postgresql pod (especially the old PG password). This can be solved either by re-creating the postgresql statefulset (the only way it picks up a new env variable after specifying old_postgres_configuration_secret), or by using a rendered .pgpass file inside the container for authentication and removing it afterwards
- for the moment either the old or the current postgresql database HAS TO BE a managed database, as all the code runs against a k8s container. I do not see an easy way to handle scenarios where both the source and the target are unmanaged databases.
I can help with points 1-2 and even provide a PR that fixes them, should this be wanted.
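For the .pgpass idea in point 2, a rough sketch of rendering and cleaning up the file could look like this (the hosts and passwords are made-up placeholders; the .pgpass format itself is `hostname:port:database:username:password`, one line per server):

```shell
# Sketch: render a temporary .pgpass so pg_dump/pg_restore can
# authenticate non-interactively against both databases, then remove
# it afterwards so no credentials are left behind.
PGPASSFILE="$(mktemp)"
export PGPASSFILE
cat > "$PGPASSFILE" <<'EOF'
old-postgres.example.com:5432:awx:awx:old-password
new-postgres.example.com:5432:awx:awx:new-password
EOF
chmod 600 "$PGPASSFILE"   # libpq ignores the file unless it is 0600

# ... run pg_dump -h / pg_restore -h here; libpq reads $PGPASSFILE ...

rm -f "$PGPASSFILE"       # clean up the credentials afterwards
```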
Actually, when I think about this once again, even point 3 would be doable with an additional management pod based on the postgresql image, which has both pg_dump and pg_restore. It might even solve problems 1-2 as well, as it could carry all the needed env variables (since it needs to be created anyway), and it would have to connect to both databases since neither of them would be running inside of it.
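Such a management pod could be sketched as a one-off Job built on the stock postgres image, pulling credentials from the two existing configuration secrets. The Job name and the env wiring here are hypothetical, not something the operator creates today:

```shell
# Hypothetical one-off migration Job; all names are placeholders.
# Variables like $OLD_HOST are expanded inside the container, not by
# this shell (note the quoted heredoc delimiter).
kubectl -n awx apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: awx-db-migrate        # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: postgres:14  # ships both pg_dump and pg_restore
          command: ["sh", "-c"]
          args:
            - >-
              pg_dump -h "$OLD_HOST" -U "$OLD_USER" -d awx -F custom |
              pg_restore -h "$NEW_HOST" -U "$NEW_USER" -d awx
          env:
            - name: OLD_HOST
              valueFrom: {secretKeyRef: {name: awx-old-postgres-configuration, key: host}}
            - name: OLD_USER
              valueFrom: {secretKeyRef: {name: awx-old-postgres-configuration, key: username}}
            # NOTE: PGPASSWORD covers only one server; with differing
            # passwords on old and new, a rendered .pgpass is needed.
            - name: PGPASSWORD
              valueFrom: {secretKeyRef: {name: awx-old-postgres-configuration, key: password}}
            - name: NEW_HOST
              valueFrom: {secretKeyRef: {name: awx-postgres-configuration, key: host}}
            - name: NEW_USER
              valueFrom: {secretKeyRef: {name: awx-postgres-configuration, key: username}}
EOF
```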
Just so I'm clear, are you saying that there is no way to spin up a new AWX instance that uses a new external postgresql install (DB/user/pass created)? I'd have to create a new AWX instance with a managed database, then do a DB migration to the external DB?
It seems that you guys are really overcomplicating things.
Look, do this: create a job or deployment with psql in the same namespace as AWX, with a persistent volume. Open a shell in the container and dump the awx database. Remove the awx deployment. Deploy postgres from Bitnami or your favorite. Don't forget to create a secret that awx will use for the external connection. Connect to the container used for the dump, connect to the new Bitnami db, and restore the db. Create awx with the new external db. Everything should be restored as normal.
I have this whole process automated to dump my db hourly using a k8s cronjob and upload to S3, and for restores I can pull from S3 and restore the db using a job in k8s.
I have only restored 1 or 2 times when the pvc became corrupted at the storage level.
Please reach out if you need some guidance.
> Just so I'm clear, are you saying that there is no way to spin up a new AWX instance that uses a new external postgresql install (DB/user/pass created)? I'd have to create a new AWX instance with a managed database, then do a DB migration to the external DB?
Actually, exactly the opposite. It is not possible to migrate to an external pg database with the built-in workflow in the AWX operator. I think it should be just fine to set up a completely new instance with external postgresql, but I did not look into that part of the code. For sure, however, with a locally awx-managed postgresql database it is not possible to use the built-in migration workflow to migrate to an external database.
And what @chinochao stated is correct - this is what I actually did. I execed into the running managed postgresql database, manually started pg_dump ... | pg_restore ... to migrate to the external database, and then updated the resource definition to point only to the external database. This ticket, however, is about the built-in migration workflow to an external database failing.
As I already said - it should be possible without much effort to migrate between internal and external databases in either direction; just one -h parameter to pg_restore would fix it. To migrate between 2 external databases, though, one would need an extra job container and a bit more code changes...
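For reference, the exec-based variant described above looks roughly like this (pod, host, and database names are placeholders, and the external database's password would still need to be supplied inside the pod, e.g. via PGPASSWORD or a .pgpass file):

```shell
# Sketch: run pg_dump inside the existing managed postgres pod and pipe
# the stream into pg_restore pointed (-h) at the external database.
kubectl -n awx exec awx-postgres-0 -- \
  sh -c 'pg_dump -U awx -d awx -F custom \
           | pg_restore -h postgresql.example.com -U awx -d awx'
```

Afterwards, the AWX resource's `postgres_configuration_secret` is pointed at the external database, as described above.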