[BUG] Unable to perform HA cluster backup
Describe the bug
HA cluster backup fails with the following error:
16:43:33 INFO cli.engine.ansible.AnsibleCommand - TASK [backup : Save etcd snapshot] ******************************************************************************************************
16:43:35 ERROR cli.engine.ansible.AnsibleCommand - fatal: [ec2-3-95-156-60.compute-1.amazonaws.com]: FAILED! => {"changed": true, "cmd": "docker run -v \"/epibackup/ansible.zKwY0W.tmp/:/backup/\" --network host --env ETCDCTL_API=3 --rm \"ec2-3-95-156-60.compute-1.amazonaws.com:5000/k8s.gcr.io/etcd:3.4.3-0 ec2-3-95-156-60.compute-1.amazonaws.com:5000/k8s.gcr.io/etcd:3.4.3-0 ec2-3-95-156-60.compute-1.amazonaws.com:5000/k8s.gcr.io/etcd:3.4.3-0\" etcdctl --endpoints https://127.0.0.1:2379 --cacert /backup/pki/etcd/ca.crt --cert /backup/pki/etcd/healthcheck-client.crt --key /backup/pki/etcd/healthcheck-client.key snapshot save /backup/etcd-snapshot.db\n", "delta": "0:00:00.062874", "end": "2020-09-16 16:43:37.864740", "msg": "non-zero return code", "rc": 125, "start": "2020-09-16 16:43:37.801866", "stderr": "docker: invalid reference format.\nSee 'docker run --help'.", "stderr_lines": ["docker: invalid reference format.", "See 'docker run --help'."], "stdout": "", "stdout_lines": []}
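The `cmd` field in the log above shows where the command goes wrong: the image argument is a single quoted string containing the same reference three times, separated by spaces, and Docker rejects an image reference that contains whitespace ("invalid reference format"). The sketch below only illustrates one way to guard against this in Python; it is not the backup role's actual code. The helper name `pick_image_reference` is hypothetical, and the idea that the value is repeated once per master node in the HA cluster is an assumption that the log does not confirm.

```python
def pick_image_reference(raw: str) -> str:
    """Collapse a whitespace-separated run of identical image references into one.

    Hypothetical helper for illustration only.
    """
    candidates = raw.split()
    if not candidates:
        raise ValueError("no image reference found")
    unique = set(candidates)
    if len(unique) != 1:
        # Different images were collected; refuse to guess which one is right.
        raise ValueError(f"ambiguous image references: {sorted(unique)}")
    return candidates[0]


# Image reference copied from the failing command in the log.
IMAGE = "ec2-3-95-156-60.compute-1.amazonaws.com:5000/k8s.gcr.io/etcd:3.4.3-0"

# Assumption: the tripled value is the same reference joined with spaces.
raw_value = " ".join([IMAGE] * 3)

image = pick_image_reference(raw_value)
print(image)
```

With a single reference in place of the tripled string, the rest of the `docker run ... etcdctl ... snapshot save` command from the log would at least get past Docker's reference parsing. Whether the snapshot then succeeds is not something this sketch can show.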
To Reproduce
Steps to reproduce the behavior:
- Deploy HA k8s cluster
- Try to execute:
  epicli backup -f backup.yml -b <build_dir>
Expected behavior
The backup completes successfully.
Config files
backup.yml:
kind: configuration/backup
title: Backup Config
name: default
specification:
  components:
    load_balancer:
      enabled: false
    logging:
      enabled: false
    monitoring:
      enabled: true
    postgresql:
      enabled: false
    rabbitmq:
      enabled: true
    kubernetes:
      enabled: true
OS (please complete the following information):
- OS: Ubuntu
Cloud Environment (please complete the following information):
- Cloud Provider: AWS
Additional context
No.
DoD checklist
- Changelog
  - [ ] updated
  - [ ] not needed
- COMPONENTS.md
  - [ ] updated
  - [ ] not needed
- Schema
  - [ ] updated
  - [ ] not needed
- Backport tasks
  - [ ] created
  - [ ] not needed
- Documentation
  - [ ] added
  - [ ] updated
  - [ ] not needed
- [ ] Feature has automated tests
- [ ] Automated tests passed (QA pipelines)
  - [ ] apply
  - [ ] upgrade
  - [ ] backup/restore
- [ ] Idempotency tested
- [ ] All conversations in PR resolved
This is a low-priority task. We don't support recovery; users have to do it themselves. :)