ansible-role-rke2
bug: restore etcd from snapshot not working
Summary
When trying to restore an etcd snapshot, the "Restore etcd" block in first_server.yml is skipped because of the following condition: 'and ( "rke2-server.service" is not in ansible_facts.services )'.
Without this second condition, i.e. with only 'when: rke2_etcd_snapshot_file', the restore works.
Issue Type
Bug Report
Ansible Version
ansible [core 2.13.3]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/exploit/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/exploit/.local/lib/python3.8/site-packages/ansible
ansible collection location = /home/exploit/.ansible/collections:/usr/share/ansible/collections
executable location = /home/exploit/.local/bin/ansible
python version = 3.8.10 (default, Jun 22 2022, 20:18:18) [GCC 9.4.0]
jinja version = 3.1.2
libyaml = True
Steps to Reproduce
Expected Results
The snapshot should be restored.
Actual Results
The snapshot is not restored.
I think I have found a syntax error in first_server.yml, on line 58:
when: rke2_etcd_snapshot_file and ( "rke2-server.service" is not in ansible_facts.services )
I fixed it by adding double quotes around ansible_facts.services:
when: rke2_etcd_snapshot_file and ( "rke2-server.service" is not in "ansible_facts.services" )
Maybe this will help you out.
Hi @cyrilschaal, the purpose of rke2_etcd_snapshot_file is to deploy a new RKE2 cluster from an etcd snapshot. The condition makes sure that etcd is not restored on an already deployed RKE2 cluster.
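@ot-random1 nope, the condition is OK. It prevents the etcd restore process from running on an already deployed cluster. In your version the right-hand side is a plain string rather than the services dictionary, so you are only checking whether the service name occurs inside the text "ansible_facts.services". It never does, which is why your condition is always true.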
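To illustrate the difference: Jinja2's in test behaves like Python's in operator, so the two variants can be sketched in plain Python. The services dictionary below is a hypothetical stand-in for ansible_facts.services on a host where the unit file exists; it is not output from a real run.

```python
# Hypothetical stand-in for ansible_facts.services (assumed values).
services = {
    "rke2-server.service": {"state": "inactive", "status": "disabled"},
}

# Original condition: membership test against the facts dictionary.
# False here, because the service name is a key of the dictionary.
original = "rke2-server.service" not in services

# Quoted variant: the right-hand side is a literal string, so this is a
# substring test. The service name is never a substring of that literal,
# so the result is True no matter what is installed on the host.
quoted = "rke2-server.service" not in "ansible_facts.services"

print(original, quoted)
```

The quoted variant therefore does not fix the check; it effectively removes it.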
Hi @MonolithProjects. I understand the purpose of rke2_etcd_snapshot_file, but on a fresh RKE2 install (not an already deployed cluster) the condition is always false, so it is not possible to deploy a new cluster from an etcd backup.
Hi, in my case Ansible reported a syntax error, something like: expecting ")" but got ".". I am using Ansible 2.12.10 at the moment.
@cyrilschaal thanks for the info. I will check it.
I have noticed that the check is incorrect: ansible_facts.services reports the service as existing, but inactive and disabled:
'rke2-agent.service': {
    'name': 'rke2-agent.service',
    'state': 'inactive',
    'status': 'disabled',
    'source': 'systemd'
},
'rke2-server.service': {
    'name': 'rke2-server.service',
    'state': 'inactive',
    'status': 'disabled',
    'source': 'systemd'
}
I can send a PR that checks for status == 'enabled' instead of simply checking whether the service is present.
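A minimal Python sketch of the logic I have in mind, using the facts shown above. The function name should_restore, its parameters, the snapshot file name, and the values for the deployed host are all hypothetical; this only models the when condition and is not the role's actual code.

```python
def should_restore(snapshot_file, services):
    """Hypothetical model of the proposed condition: restore only when a
    snapshot file is set and rke2-server.service is not enabled."""
    server = services.get("rke2-server.service", {})
    return bool(snapshot_file) and server.get("status") != "enabled"


# Facts as reported in this thread: unit file present, but never enabled.
fresh_host = {
    "rke2-server.service": {
        "name": "rke2-server.service",
        "state": "inactive",
        "status": "disabled",
        "source": "systemd",
    },
}

# Assumed facts for an already deployed cluster (illustrative values).
deployed_host = {
    "rke2-server.service": {
        "name": "rke2-server.service",
        "state": "running",
        "status": "enabled",
        "source": "systemd",
    },
}

# The current presence-only check skips the restore on the fresh host.
presence_check = "rke2-server.service" not in fresh_host

print(presence_check)                                 # current check
print(should_restore("snapshot.db", fresh_host))      # proposed check
print(should_restore("snapshot.db", deployed_host))   # still protected
```

With the status check, a host that merely has the unit file installed is still eligible for a restore, while a host with the service enabled is skipped.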