ansible-role-rke2

bug: restore etcd from snapshot not working

Open cyrilschaal opened this issue 3 years ago • 1 comments

Summary

When trying to restore an etcd snapshot, the "Restore etcd" block in the first_server.yml file is skipped because of the following condition: 'and ( "rke2-server.service" is not in ansible_facts.services )'.

With this second condition removed, leaving only 'when: rke2_etcd_snapshot_file', the restore works.

Issue Type

Bug Report

Ansible Version

ansible [core 2.13.3]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/exploit/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/exploit/.local/lib/python3.8/site-packages/ansible
  ansible collection location = /home/exploit/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/exploit/.local/bin/ansible
  python version = 3.8.10 (default, Jun 22 2022, 20:18:18) [GCC 9.4.0]
  jinja version = 3.1.2
  libyaml = True

Steps to Reproduce


Expected Results

The snapshot should be restored.

Actual Results

The snapshot is not restored.

cyrilschaal avatar Oct 04 '22 14:10 cyrilschaal

I think I have found a syntax error in first_server.yml, occurring at line 58.

when: rke2_etcd_snapshot_file and ( "rke2-server.service" is not in ansible_facts.services )

I fixed it by adding double quotes around ansible_facts.services.

when: rke2_etcd_snapshot_file and ( "rke2-server.service" is not in "ansible_facts.services" )

Maybe this will help you out.

ot-random1 avatar Oct 24 '22 15:10 ot-random1

Hi @cyrilschaal, the purpose of rke2_etcd_snapshot_file is to deploy a new RKE2 cluster from an etcd snapshot. So the condition makes sure that etcd will not be restored on an already deployed RKE2 cluster.

MonolithProjects avatar Oct 27 '22 15:10 MonolithProjects

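@ot-random1 no, the condition is OK. It prevents the etcd restore process from running on an already deployed cluster. In your version the service name is tested against the literal string "ansible_facts.services" rather than the gathered facts, and that string never contains the service name, which is why your change makes the condition true.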
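
To illustrate the difference, here is a minimal sketch of the two variants side by side. The debug tasks and their names are hypothetical stand-ins for the role's actual "Restore etcd" block, and the sketch assumes that service facts have been gathered and that rke2_etcd_snapshot_file is set.

```yaml
# Hypothetical tasks for illustration only; not the role's actual code.
- name: Gather service facts so that ansible_facts.services is populated
  ansible.builtin.service_facts:

# Unquoted: ansible_facts.services is the dictionary of known services,
# so this tests whether the key "rke2-server.service" exists in it.
- name: Variant with the unquoted variable
  ansible.builtin.debug:
    msg: "restore would run"
  when: rke2_etcd_snapshot_file and ( "rke2-server.service" is not in ansible_facts.services )

# Quoted: the right-hand side is the literal text "ansible_facts.services".
# The service name is never a substring of that text, so this part is always true.
- name: Variant with the quoted string
  ansible.builtin.debug:
    msg: "restore would run"
  when: rke2_etcd_snapshot_file and ( "rke2-server.service" is not in "ansible_facts.services" )
```

With the quoted form the second half of the condition no longer depends on the host at all, so it behaves the same as 'when: rke2_etcd_snapshot_file' alone.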

MonolithProjects avatar Oct 27 '22 15:10 MonolithProjects

Hi @MonolithProjects. I understand the purpose of rke2_etcd_snapshot_file, but on a fresh RKE2 install (not an already deployed cluster) the condition is always false, so it is not possible to deploy a new cluster from an etcd backup.

cyrilschaal avatar Oct 28 '22 09:10 cyrilschaal

Hi, in my case Ansible reported a syntax error. Something like: expecting ")" but got ".". I am using Ansible 2.12.10 at the moment.

ot-random1 avatar Oct 28 '22 09:10 ot-random1

@cyrilschaal thanks for the info. I will check it.

MonolithProjects avatar Oct 28 '22 09:10 MonolithProjects

I have noticed that the check is incorrect, because ansible_facts.services reports the service as existing, but inactive and disabled:

	'rke2-agent.service': {
		'name': 'rke2-agent.service',
		'state': 'inactive',
		'status': 'disabled',
		'source': 'systemd'
	},
	'rke2-server.service': {
		'name': 'rke2-server.service',
		'state': 'inactive',
		'status': 'disabled',
		'source': 'systemd'
	}

I can send a PR that checks for status = 'enabled' instead of simply checking whether the service is present.
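
A rough sketch of what such a check might look like, not a tested patch. The debug task is a hypothetical placeholder for the role's "Restore etcd" block, and the sketch assumes service facts are gathered beforehand and that a disabled or missing rke2-server.service means the cluster has not been deployed yet.

```yaml
# Hypothetical sketch; the task names and the debug body are placeholders.
- name: Gather service facts so that ansible_facts.services is populated
  ansible.builtin.service_facts:

# .get() with a default covers hosts where the unit file is not installed at all,
# so the condition is true when the service is either missing or not enabled.
- name: Restore etcd (placeholder for the real block)
  ansible.builtin.debug:
    msg: "rke2-server.service is not enabled, so the restore would run"
  when: rke2_etcd_snapshot_file and ( ansible_facts.services.get('rke2-server.service', {}).get('status') != 'enabled' )
```

On the host shown above, where the status is 'disabled', this condition would be true and the restore would run. On a host where the service is already enabled, it would be skipped.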

tchinmai7 avatar Mar 15 '23 05:03 tchinmai7