rcons / goconserver from remote host.
We have an xCAT 2.16.5 deployment on RHEL 8 with Red Hat High Availability (Pacemaker) and 3 xCAT management nodes in the HA Pacemaker cluster. Everything works fine with the following set in /root/.bash_profile on each of the HA nodes:
export XCATHOST=172.20.0.1:3001
export CONSERVER=172.20.0.1
where 172.20.0.1 is the floating IP address managed by Pacemaker, except that when I use rcons from a non-active management node, it connects with
ssh -t 172.20.0.1 /opt/xcat/share/xcat/cons/ipmi a0n13
instead of congo:
[root@atmos-mgmt4 ~]# pcs resource status goconserver
* goconserver (systemd:goconserver.service): Started atmos-mgmt4
[root@atmos-mgmt4 ~]# echo $CONSERVER
172.20.0.1
[root@atmos-mgmt4 ~]# rcons a0n13
[Enter `^Ec?' for help]
goconserver(2023-05-08T11:54:24-04:00): Hello 172.20.0.4:34740, welcome to the session of a0n13
[root@a0n13 ~]# [Disconnected]
Connection to 172.20.0.1 closed.
[root@atmos-mgmt3 ~]# rcons a0n13
**** Enter ~? for help *****
Acquiring startup lock...done
[SOL Session operational. Use ~? for help]
[root@a0n13 ~]# Connection to atmos-mgmt3 closed.
[bdk@albert ~]$
While it works with /opt/xcat/share/xcat/cons/ipmi, the problem is that its disconnect string is ~., the same as the SSH disconnect string, so when I try to disconnect from the rcons session, it also kicks me out of the SSH session to the HA node.
Looking at /opt/xcat/bin/rcons, it seems the only method used to decide between conserver and goconserver is checking whether the goconserver service is running:
GOCONSERVER_RC=`service goconserver status >& /dev/null; echo $?`
if [[ ${GOCONSERVER_RC} == 0 ]]; then
    USE_GOCONSERVER=1
fi
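That probe is presumably what sends the non-active node down the ipmi path: Pacemaker has goconserver started on the peer, so the local status check exits non-zero and USE_GOCONSERVER never gets set, even though the daemon is reachable on the floating IP:
# On the standby HA node the unit is inactive locally (it runs on the
# peer under Pacemaker control), so this exits non-zero and rcons falls
# through to the /opt/xcat/share/xcat/cons/ipmi method.
service goconserver status >& /dev/null; echo $?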
If I manually change /opt/xcat/bin/rcons so that
USE_GOCONSERVER=1
then it uses
ssh -t 172.20.0.1 /usr/bin/congo console a0n13
and works as expected:
[root@atmos-mgmt3 ~]# pcs resource status goconserver
* goconserver (systemd:goconserver.service): Started atmos-mgmt4
[root@atmos-mgmt3 ~]# rcons a0n13
[Enter `^Ec?' for help]
goconserver(2023-05-08T12:07:35-04:00): Hello 172.20.0.4:41770, welcome to the session of a0n13
[root@a0n13 ~]# [Disconnected]
Connection to 172.20.0.1 closed.
[root@atmos-mgmt3 ~]#
So, my question is: is there a trick (environment variable, etc.) that I'm not seeing to force rcons to use the congo method, instead of /opt/xcat/share/xcat/cons/ipmi, when connecting from a remote host? I can make the code change to rcons easily enough on my 3 HA nodes, but if there is a better way to do it, I'd like to use that method.
If there isn't a better way, then since all the code to support congo or ipmi consoles via SSH is already in rcons, it seems a reasonable RFE to let the end user force the choice of conserver/ipmi vs. goconserver/congo, instead of relying on pidof and assuming you're running rcons on the same node that conserver/goconserver is running on.
Thanks.
Every level of nested remote access adds a ~ to the disconnect string you need to send. If you want to disconnect from an rcons session reached via ssh, use ~~. instead. The same applies across multiple levels of SSH: for example, for rcons in an ssh console reached through a jumphost (laptop =ssh=> jumphost =ssh=> MN =rcons=> node-69), use ~~~. to disconnect the rcons session. Note that this is standard ssh behavior.
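The doubling works because ssh only treats ~ specially at the start of a line, and ~~ is forwarded as a single literal ~, so each ssh hop strips one tilde until the final ~. reaches the console program:
typed on the laptop : ~~~.
after the first ssh : ~~.
after the second ssh: ~.    <- received by the console program, which disconnects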
Thanks, that gets me around the ipmi disconnect problem, but I'd love to have a way to specify/force the use of congo instead, since it does work if I just set
USE_GOCONSERVER=1
in the rcons bash script.
Why don't you set it locally and then have ssh send it along with the ssh command, like:
export USE_GOCONSERVER=1
ssh -o SendEnv=USE_GOCONSERVER -t 172.20.0.1 /usr/bin/congo console a0n13
Alternatively:
ssh -t 172.20.0.1 USE_GOCONSERVER=1 /usr/bin/congo console a0n13
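One caveat for anyone trying the SendEnv route: sshd drops client-supplied variables unless they are whitelisted on the remote side, so 172.20.0.1 would also need something like the following in its sshd_config:
# /etc/ssh/sshd_config on the console server -- required for SendEnv to take effect
AcceptEnv USE_GOCONSERVER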
That doesn't really make sense. USE_GOCONSERVER is a setting inside rcons; if I run congo directly over SSH myself, I don't need USE_GOCONSERVER. And running congo manually as above is a lot more complicated than just doing
rcons a0n13
My goal is to do this the xCAT way. I could write my own simple, stripped-down rcons script (a minimal wrapper is sketched below), or copy rcons to myrcons, make the code change, and use that going forward, or change rcons directly and re-apply the same update each time a new version of xCAT is released (we do that now with dhcp.pm and ddns.pm so that xCAT will use something other than MD5 with omshell, letting makehosts and makedns work correctly under FIPS).
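For the record, the stripped-down wrapper only needs a few lines. This is a hypothetical sketch (myrcons is my name for it, not an xCAT command), reusing the CONSERVER variable already exported in /root/.bash_profile:
#!/bin/bash
# myrcons - force the congo path the way a patched rcons does,
# instead of probing the local goconserver service.
NODE="$1"
[ -z "$NODE" ] && { echo "Usage: ${0##*/} <node>" >&2; exit 1; }
exec ssh -t "${CONSERVER}" /usr/bin/congo console "$NODE"
Then running myrcons a0n13 from any of the HA nodes behaves like the patched rcons does.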
But ultimately, it would be nice if there were a way to force it to use congo. There is already an override in rcons for confluent:
CONSOLE_SERVICE_KEYWORD=`tabdump site | grep consoleservice | cut -d, -f1 | tr -d '"'`
CONSOLE_SERVICE_VALUE=`tabdump site | grep consoleservice | cut -d, -f2 | tr -d '"'`
if [ "$CONSOLE_SERVICE_KEYWORD" == "consoleservice" ]; then
    if [ "$CONSOLE_SERVICE_VALUE" == "confluent" ]; then
        USE_CONFLUENT=1
    fi
fi
So CONSOLE_SERVICE_KEYWORD is already being pulled from the site table, checked, and used to force confluent as the console service. A simple addition to the above if statement, like the following
elsif [ "$CONSOLE_SERVICE_KEYWORD" == "goconsoleservice" ]; then
USE_GOCONSERVER=1
fi
should allow a site table override to force the use of congo. But the only checks I see in rcons for whether to use congo are based on checking whether the goconserver service is installed AND running:
if [ $USE_CONFLUENT != "1" ] && [ -f "/usr/bin/congo" ] && [ -f "/usr/bin/goconserver" ]; then
    GOCONSERVER_RC=`service goconserver status >& /dev/null; echo $?`
    if [[ ${GOCONSERVER_RC} == 0 ]]; then
        USE_GOCONSERVER=1
    fi
    if [[ ${USE_GOCONSERVER} == 1 ]]; then
        CONSERVER_RC=`pidof conserver >> /dev/null; echo $?`
        if [[ ${CONSERVER_RC} == 0 ]]; then
            echo "Error: Both goconserver and conserver are running, please stop one of them, and retry..."
            exit 1
        fi
    fi
fi
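Putting it together, the change I'm imagining would look something like this (goconsoleservice being a new, hypothetical site key; this assumes only one of the two keys is set, since the grep above matches both):
if [ "$CONSOLE_SERVICE_KEYWORD" == "consoleservice" ]; then
    if [ "$CONSOLE_SERVICE_VALUE" == "confluent" ]; then
        USE_CONFLUENT=1
    fi
elif [ "$CONSOLE_SERVICE_KEYWORD" == "goconsoleservice" ]; then
    # Site-table override: force the congo path regardless of whether
    # goconserver happens to be running on this node.
    USE_GOCONSERVER=1
fi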
@bviviano I see the problem now, after examining the rcons code for xCAT 2.16, which I should have done first, before proposing that non-solution :man_facepalming: