awx-operator
awx restore failed
ISSUE TYPE
- Bug Report
SUMMARY
ENVIRONMENT
- AWX version: awx:19.2.0
- Operator version: awx-operator:0.10.0
- Kubernetes version: minikube version: v1.21.0
- AWX install method: minikube on Linux
STEPS TO REPRODUCE
The AWX backup was successful, but the restore failed. Please tell me why the restore fails.
ADDITIONAL INFORMATION
# awx-deployment.yml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  service_type: nodeport
  ingress_type: none
  hostname: awx.qoo10jp.net
  projects_persistence: false
  projects_storage_class: awx-pv-vol1
  projects_storage_access_mode: ReadWriteOnce
  web_extra_volume_mounts: |
    - name: static-data
      mountPath: /var/lib/projects
  extra_volumes: |
    - name: static-data
      persistentVolumeClaim:
        claimName: awx-pv-vol1
  postgres_storage_requirements:
    requests:
      storage: 5Gi
    limits:
      storage: 5Gi
  postgres_storage_class: awx-pv-vol1
  postgres_resource_requirements:
    requests:
      cpu: 300m
# cat awx-pvc-create.yml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-pv-vol1
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: awx-pv-vol1
  resources:
    requests:
      storage: 5Gi
# awx-pv-create.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-pv-vol1
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: awx-pv-vol1
  hostPath:
    path: "/data"
# postgres-pv-create.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-awx-postgres-0
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: awx-pv-vol1
  hostPath:
    path: "/postgresdata"
# postgres-pvc-create.yml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-awx-postgres-0
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: awx-pv-vol1
  resources:
    requests:
      storage: 5Gi
# awx-backup-pvc.yml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup-claim
  namespace: default
  ownerReferences: null
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: '5Gi'
# awx-backup.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awxbackup
  namespace: default
spec:
  deployment_name: awx
  backup_pvc: 'awx-backup-claim'
  backup_storage_requirements: '5Gi'
  backup_storage_class: 'standard'
  postgres_label_selector: app.kubernetes.io/instance=postgres-awx
# awx-restore.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: awxrestore
  namespace: default
spec:
  deployment_name: awx
  backup_pvc_namespace: default
  backup_name: awxbackup
AWX-OPERATOR LOGS
/restore/tasks/deploy_awx.yml:3\nok: [localhost] => {\"ansible_facts\": {\"_kind\": \"AWXRestore\"}, \"changed\": false}\n\r\nTASK [restore : Get AWX object definition from pvc] ****************************\r\ntask path: /opt/ansible/roles/restore/tasks/deploy_awx.yml:7\nchanged: [localhost] => {\"changed\": true, \"return_code\": 0, \"stderr\": \"\", \"stderr_lines\": [], \"stdout\": \"{admin_user: admin, api_version: awx.ansible.com/v1beta1, create_preload_data: True, deployment_type: awx, extra_volumes: - name: static-datan persistentVolumeClaim:n claimName: awx-pv-vol1n, garbage_collect_secrets: False, hostname: awx.qoo10jp.net, image_pull_policy: IfNotPresent, ingress_type: none, kind: AWX, loadbalancer_port: 80, loadbalancer_protocol: http, postgres_resource_requirements: {requests: {cpu: 300m}}, postgres_storage_class: awx-pv-vol1, postgres_storage_requirements: {limits: {storage: 5Gi}, requests: {storage: 5Gi}}, projects_persistence: False, projects_storage_access_mode: ReadWriteOnce, projects_storage_class: awx-pv-vol1, projects_storage_size: 8Gi, replicas: 1, route_tls_termination_mechanism: Edge, service_type: nodeport, task_privileged: False, web_extra_volume_mounts: - name: static-datan mountPath: /var/lib/projectsn}\\n\", \"stdout_lines\": [\"{admin_user: admin, api_version: awx.ansible.com/v1beta1, create_preload_data: True, deployment_type: awx, extra_volumes: - name: static-datan persistentVolumeClaim:n claimName: awx-pv-vol1n, garbage_collect_secrets: False, hostname: awx.qoo10jp.net, image_pull_policy: IfNotPresent, ingress_type: none, kind: AWX, loadbalancer_port: 80, loadbalancer_protocol: http, postgres_resource_requirements: {requests: {cpu: 300m}}, postgres_storage_class: awx-pv-vol1, postgres_storage_requirements: {limits: {storage: 5Gi}, requests: {storage: 5Gi}}, projects_persistence: False, projects_storage_access_mode: ReadWriteOnce, projects_storage_class: awx-pv-vol1, projects_storage_size: 8Gi, replicas: 1, 
route_tls_termination_mechanism: Edge, service_type: nodeport, task_privileged: False, web_extra_volume_mounts: - name: static-datan mountPath: /var/lib/projectsn}\"]}\n\r\nTASK [restore : Create temp file for spec dict] ********************************\r\ntask path: /opt/ansible/roles/restore/tasks/deploy_awx.yml:15\nchanged: [localhost] => {\"changed\": true, \"gid\": 0, \"group\": \"root\", \"mode\": \"0600\", \"owner\": \"ansible-operator\", \"path\": \"/tmp/ansible.1ebcxk56\", \"size\": 0, \"state\": \"file\", \"uid\": 1001}\n\r\nTASK [restore : Write spec vars to temp file] **********************************\r\ntask path: /opt/ansible/roles/restore/tasks/deploy_awx.yml:20\nchanged: [localhost] => {\"changed\": true, \"checksum\": \"6fd26d4821b4efe1ad68d75ae55dd40994e798f2\", \"dest\": \"/tmp/ansible.1ebcxk56\", \"gid\": 0, \"group\": \"root\", \"md5sum\": \"389a3d2c0b7d356101b212ea9d3a998f\", \"mode\": \"0644\", \"owner\": \"ansible-operator\", \"size\": 860, \"src\": \"/opt/ansible/.ansible/tmp/ansible-tmp-1650728829.548958-10015-250841844392712/source\", \"state\": \"file\", \"uid\": 1001}\n\r\nTASK [restore : Include spec vars to save them as a dict] **********************\r\ntask path: /opt/ansible/roles/restore/tasks/deploy_awx.yml:26\nfatal: [localhost]: FAILED! 
=> {\"ansible_facts\": {}, \"ansible_included_var_files\": [], \"changed\": false, \"message\": \"We were unable to read either as JSON nor YAML, these are the errors we got from each:\\nJSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)\\n\\nSyntax Error while loading YAML.\\n expected the node content, but found '-'\\n\\nThe error appears to be in '/tmp/ansible.1ebcxk56': line 1, column 123, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n\\n{admin_user: admin, api_version: awx.ansible.com/v1beta1, create_preload_data: True, deployment_type: awx, extra_volumes: - name: static-datan persistentVolumeClaim:n claimName: awx-pv-vol1n, garbage_collect_secrets: False, hostname: awx.qoo10jp.net, image_pull_policy: IfNotPresent, ingress_type: none, kind: AWX, loadbalancer_port: 80, loadbalancer_protocol: http, postgres_resource_requirements: {requests: {cpu: 300m}}, postgres_storage_class: awx-pv-vol1, postgres_storage_requirements: {limits: {storage: 5Gi}, requests: {storage: 5Gi}}, projects_persistence: False, projects_storage_access_mode: ReadWriteOnce, projects_storage_class: awx-pv-vol1, projects_storage_size: 8Gi, replicas: 1, route_tls_termination_mechanism: Edge, service_type: nodeport, task_privileged: False, web_extra_volume_mounts: - name: static-datan mountPath: /var/lib/projectsn}\\n ^ here\\n\"}\n\r\nPLAY RECAP *********************************************************************\r\nlocalhost : ok=22 changed=9 unreachable=0 failed=1 skipped=12 rescued=0 ignored=0 \r\n\n","job":"3627100269752912500","name":"awxrestore","namespace":"default","error":"exit status 2","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/[email protected]/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:239"}
--------------------------- Ansible Task Status Event StdOut -----------------
PLAY RECAP *********************************************************************
localhost : ok=22 changed=9 unreachable=0 failed=1 skipped=12 rescued=0 ignored=0
-------------------------------------------------------------------------------
{"level":"error","ts":1650728830.8202965,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"awxrestore-controller","request":"default/awxrestore","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"}
Below is my awx_object. Doesn't something look wrong with it?
{admin_user: admin, api_version: awx.ansible.com/v1beta1, create_preload_data: True, deployment_type: awx, extra_volumes: - name: static-data n persistentVolumeClaim: n claimName: awx-pv-vol1n, garbage_collect_secrets: False, hostname: awx.test.net, image_pull_policy: IfNotPresent, ingress_type: none, kind: AWX, loadbalancer_port: 80, loadbalancer_protocol: http, postgres_storage_class: awx-pv-vol1, postgres_storage_requirements: {limits: {storage: 8Gi}, requests: {storage: 8Gi}}, projects_persistence: False, projects_storage_access_mode: ReadWriteOnce, projects_storage_class: awx-pv-vol1, projects_storage_size: 8Gi, replicas: 1, route_tls_termination_mechanism: Edge, service_type: nodeport, task_privileged: False, web_extra_volume_mounts: - name: static-data n mountPath: /var/lib/projectsn}
I'm not sure why your restore didn't work. However, I just made a copy of my AWX instance in a second namespace for testing. I mounted the PV in a container I created and copied the data off. I'm using OpenShift, so I used oc rsync.
The backup directory contains three files, but you only need the .db dump file to restore. You can then create the namespace and, in it, a secret that stores your secret_key, using the same value as in your old setup. I did not create any other secrets or objects in the new namespace.
Go to your two deployments and scale them to zero so that only the postgres container is running. Then run these commands (kubectl should work in place of oc):
oc exec -it awx-postgres-0 -- dropdb -U awx awx
oc exec -it awx-postgres-0 -- createdb -U awx awx
oc exec -it awx-postgres-0 -- pg_restore --verbose -U awx -d awx < tower.db
(Replace the awx before postgres with your namespace name.)
Scale the two deployments back up to 1, and everything should be fine once the web container has finished connecting. I've done this several times and it works.
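For anyone on plain Kubernetes, the steps above could be sketched with kubectl roughly as follows. This is only a sketch under assumptions: the deployment names awx-web and awx-task, the default namespace, and the RUN switch are placeholders that do not come from this thread, so check them against your own cluster first. By default the functions only print the commands they would run.

```shell
#!/bin/sh
# Sketch of the manual restore steps with kubectl instead of oc.
# Everything below is an assumption to adjust for your cluster:
#   NAMESPACE   - namespace holding the AWX instance
#   POD         - postgres pod name (awx-postgres-0 in the comment above)
#   DUMP        - local path to the .db dump copied off the backup PV
#   DEPLOYMENTS - hypothetical deployment names; list yours with
#                 "kubectl get deployments"
NAMESPACE="${NAMESPACE:-default}"
POD="${POD:-awx-postgres-0}"
DUMP="${DUMP:-tower.db}"
DEPLOYMENTS="${DEPLOYMENTS:-awx-web awx-task}"

# Print the command unless RUN=1 is set, in which case execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Scale every listed deployment to the given replica count.
scale_all() {
  for d in $DEPLOYMENTS; do
    run kubectl -n "$NAMESPACE" scale deployment "$d" --replicas="$1"
  done
}

restore_db() {
  scale_all 0
  run kubectl -n "$NAMESPACE" exec "$POD" -- dropdb -U awx awx
  run kubectl -n "$NAMESPACE" exec "$POD" -- createdb -U awx awx
  # pg_restore reads the dump from stdin, so exec needs -i plus a redirect.
  if [ "${RUN:-0}" = "1" ]; then
    kubectl -n "$NAMESPACE" exec -i "$POD" -- \
      pg_restore --verbose -U awx -d awx < "$DUMP"
  else
    echo "kubectl -n $NAMESPACE exec -i $POD -- pg_restore --verbose -U awx -d awx < $DUMP"
  fi
  scale_all 1
}
```

Source the file and call restore_db to review the printed commands; once they look right for your cluster, run RUN=1 restore_db to execute them. The dry-run default is deliberate, since dropdb is destructive.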
@dsim4 thank you for providing that workaround!
@sohwaje This issue has been fixed with the following PR:
- https://github.com/ansible/awx-operator/pull/1652
Could you try again with the latest operator devel image?
- quay.io/ansible/awx-operator:devel
If you are still experiencing the issue with the new devel image, please open a new issue. Thanks!