
Interactive bootstrap with multiple corosync links enters a dead loop

Open zzhou1 opened this issue 9 months ago • 4 comments

Here is the environment.

adm@tw-1:~> sudo ip r
default via 192.168.100.1 dev enp1s0 proto dhcp src 192.168.100.179 metric 101 
default via 192.1.160.1 dev enp8s0 proto static metric 20103 
default via 192.2.160.1 dev enp9s0 proto static metric 20104 
127.0.0.0/8 dev lo proto kernel scope link src 127.0.0.1 metric 30 
192.1.160.0/24 dev enp8s0 proto kernel scope link src 192.1.160.101 metric 103 
192.2.160.0/24 dev enp9s0 proto kernel scope link src 192.2.160.101 metric 104 
192.3.160.0/24 dev enp12s0 proto kernel scope link src 192.3.160.101 metric 100 
192.168.0.0/16 dev enp1s0 proto kernel scope link src 192.168.100.179 metric 101 
192.168.160.0/24 dev enp7s0 proto kernel scope link src 192.168.160.101 metric 102 
adm@tw-1:~> 

bootstrap enters a dead loop?

adm@tw-1:~> sudo -E crm cluster init
INFO: Loading "default" profile from /etc/crm/profiles.yml
INFO: Loading "knet-default" profile from /etc/crm/profiles.yml
INFO: Adding public keys to authorized_keys for user root...
INFO: Added public key 2048 SHA256:Ch61e2yqy5Tkw08UEYeNsi9YXBWoujjtYncF+XbuJ3w /home/zzhou/.ssh/id_rsa (RSA).
INFO: A public key is added to authorized_keys for user hacluster: 3072 SHA256:IqRvCzQbjjKiCy5z2IUfJHZO37U+QqhPKNeoh9fIEE0 Cluster internal on tw-1 (RSA)
INFO: Added firewalld service high-availability on tw-1
INFO: Configuring csync2
INFO: Starting csync2.socket service on tw-1
INFO: BEGIN csync2 checking files
INFO: END csync2 checking files
Address for ring0 [192.168.100.179]192.1.160.101
Address for ring1 [192.168.100.179]192.2.160.101
Address for ring2 [192.168.100.179]
Address for ring3 []
Address for ring3 []
Address for ring3 []
Address for ring3 []
Address for ring3 []Ctrl-C, leaving

dead loop, again ...

Address for ring0 [192.168.100.179]192.1.160.101
Address for ring1 [192.168.100.179]
Address for ring2 []
Address for ring2 []
Address for ring2 []
Address for ring2 []
Address for ring2 []
Address for ring3 []Ctrl-C, leaving

However, it behaves normally when more links are configured:

adm@tw-1:~> sudo -E crm cluster init
INFO: Loading "default" profile from /etc/crm/profiles.yml
INFO: Loading "knet-default" profile from /etc/crm/profiles.yml
INFO: Adding public keys to authorized_keys for user root...
INFO: Added public key 2048 SHA256:Ch61e2yqy5Tkw08UEYeNsi9YXBWoujjtYncF+XbuJ3w /home/zzhou/.ssh/id_rsa (RSA).
INFO: A public key is added to authorized_keys for user hacluster: 3072 SHA256:IqRvCzQbjjKiCy5z2IUfJHZO37U+QqhPKNeoh9fIEE0 Cluster internal on tw-1 (RSA)
INFO: Added firewalld service high-availability on tw-1
INFO: Configuring csync2
INFO: Starting csync2.socket service on tw-1
INFO: BEGIN csync2 checking files
INFO: END csync2 checking files
Address for ring0 [192.168.100.179]192.1.160.101
Address for ring1 [192.168.100.179]192.2.160.101
Address for ring2 [192.168.100.179]192.3.160.101
Address for ring3 [192.168.100.179]192.168.160.101
Address for ring4 [192.168.100.179]
INFO: Configure SBD:

Oops, not really: dead loop again.

Address for ring0 [192.168.100.179]
Address for ring1 []192.1.160.101
Address for ring2 []192.2.160.101
Address for ring3 []192.3.160.101
Address for ring4 []
Address for ring4 []
Address for ring4 []
Address for ring3 []Ctrl-C, leaving

zzhou1 avatar Jul 07 '25 09:07 zzhou1

@nicholasyang2022 Please take a look when you have time. Thanks!

liangxin1300 avatar Jul 07 '25 11:07 liangxin1300

One reason is that the existing crm.conf interferes with the bootstrap init procedure.

adm@tw-1:~> sudo grep force /root/.config/crm/crm.conf
force = true

zzhou1 avatar Jul 08 '25 05:07 zzhou1

BTW, even with force = true in crm.conf, one possible user-experience improvement would be to print the auto-confirmations during the process, as shown below. That would serve as a hint that the force option is set in crm.conf.

csync2 is already configured - overwrite (y/n)? y
/etc/corosync/authkey already exists - overwrite (y/n)? y
/etc/corosync/corosync.conf already exists - overwrite (y/n)? y
Add another ring (y/n)? y
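
A minimal sketch of what such echoing could look like, in Python. This is not crmsh's actual code: the `confirm` helper and its `force`, `out`, and `read` parameters are hypothetical names used only for illustration.

```python
def confirm(question, force=False, out=print, read=input):
    """Ask a yes/no question (hypothetical helper, not crmsh's API).

    With force set, the question is answered "y" automatically, but the
    prompt and the automatic answer are still printed.
    """
    prompt = f"{question} (y/n)? "
    if force:
        # Echo the prompt with the automatic answer so the user can see
        # that a force setting answered on their behalf.
        out(prompt + "y")
        return True
    while True:
        answer = read(prompt).strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
```

Calling `confirm("Add another ring", force=True)` would print the prompt followed by `y` instead of staying silent, which makes an inherited force setting visible in the bootstrap output.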

zzhou1 avatar Jul 08 '25 05:07 zzhou1

Should we stop saving force in crm.conf? I think it should apply only to the currently running command and should never be saved.
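
One way this could be approached, sketched in Python under stated assumptions: the `save_config` function, the `RUNTIME_ONLY` set, and the `core` section name are hypothetical and are not taken from crmsh's code. The idea is simply to filter runtime-only options out before the configuration is written.

```python
import configparser
import io

# Options that should affect only the running command and never be
# written to disk. The ("core", "force") entry is an assumption.
RUNTIME_ONLY = {("core", "force")}


def save_config(settings, stream):
    """Write settings to stream, skipping runtime-only options."""
    parser = configparser.ConfigParser()
    for section, options in settings.items():
        kept = {
            key: value
            for key, value in options.items()
            if (section, key) not in RUNTIME_ONLY
        }
        if kept:
            parser[section] = kept
    parser.write(stream)


# Usage: force is dropped, other options are kept.
buffer = io.StringIO()
save_config({"core": {"force": "true", "editor": "vim"}}, buffer)
saved_text = buffer.getvalue()
```

With this kind of filter, a `force` value set for one invocation would not leak into later `crm cluster init` runs through the saved file.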

nicholasyang2022 avatar Oct 13 '25 06:10 nicholasyang2022