Node names get mangled when a node name contains one or more 0 characters

Open OleHolmNielsen opened this issue 4 years ago • 3 comments

We have two nodes listserv01 and listserv3, and the second node name listserv3 gets mangled when taken together with the listserv01 node:

$ clush -w listserv01,listserv3 uname -r
listserv03: 2.6.32-754.30.2.el6.x86_64
listserv01: 4.18.0-193.6.3.el8_2.x86_64

When taken alone there is no issue with the listserv3 node name:

$ clush -w listserv3 uname -r
listserv3: 2.6.32-754.30.2.el6.x86_64

(Note: I had to define a DNS CNAME alias listserv03 pointing to listserv3 as a workaround).

The incorrect leading zero gets added to all subsequent names starting with "listserv":

$ clush -w listserv01,listserv3,listserv2 uname -r
listserv02: ssh: Could not resolve hostname listserv02: Name or service not known
clush: listserv02: exited with exit code 255
listserv03: 2.6.32-754.30.2.el6.x86_64
listserv01: 4.18.0-193.6.3.el8_2.x86_64

Yet another example with "00":

$ clush -w a001,a2 uname -r
a001: 3.10.0-1127.8.2.el7.x86_64
a002: 3.10.0-1127.8.2.el7.x86_64

So it seems that a zero-padded number in one node name triggers the bug: the leading zeroes are incorrectly applied to the other node names in the list that share the same prefix.
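The behaviour in the examples above can be reproduced with a toy model. This is a minimal sketch, assuming the mangling comes from keeping the numeric indices as integers together with a single shared padding width per name prefix. That assumption is not confirmed anywhere in this issue, and the function `expand_with_shared_padding` is hypothetical, not part of ClusterShell.

```python
import re


def expand_with_shared_padding(names):
    """Toy model (NOT ClusterShell code): one padding width per prefix."""
    indices = {}   # prefix -> set of integer indices
    padding = {}   # prefix -> widest zero-padded width seen (assumption)
    plain = []     # names without a numeric suffix are passed through

    for name in names:
        m = re.fullmatch(r"(.*?)(\d+)", name)
        if m is None:
            plain.append(name)
            continue
        prefix, digits = m.groups()
        # Converting to int discards the per-name padding.
        indices.setdefault(prefix, set()).add(int(digits))
        # A leading zero marks the index as padded; keep the widest width.
        if len(digits) > 1 and digits[0] == "0":
            padding[prefix] = max(padding.get(prefix, 0), len(digits))

    result = list(plain)
    for prefix in sorted(indices):
        width = padding.get(prefix, 0)
        for i in sorted(indices[prefix]):
            # The shared width is applied to every index of this prefix.
            result.append(prefix + str(i).zfill(width))
    return result


print(expand_with_shared_padding(["listserv01", "listserv3"]))
print(expand_with_shared_padding(["a001", "a2"]))
print(expand_with_shared_padding(["listserv3"]))
```

Under this assumption, `listserv3` becomes `listserv03` and `a2` becomes `a002` as soon as a padded sibling is present, while a lone `listserv3` is left untouched, which matches the observed output.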

We use this EPEL6 package: clustershell-1.8.3-1.el6_10.noarch
and this EPEL7 package: clustershell-1.8.3-1.el7.noarch
and this Fedora FC32 package: clustershell-1.8.3-2.fc32.noarch
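On affected versions such as those listed above, it may help to check a node list for mixed zero-padding before passing it to `clush -w`. The following is a rough sketch only: the helper `names_at_risk` is hypothetical, and its rule (a name is at risk when its number is shorter than the widest zero-padded number with the same prefix) is inferred from the examples in this report, not from ClusterShell's source.

```python
import re


def names_at_risk(names):
    """Return names whose number is shorter than a padded sibling's width."""
    parsed = []   # (name, prefix, digits) for names with a numeric suffix
    widest = {}   # prefix -> widest zero-padded width seen

    for name in names:
        m = re.fullmatch(r"(.*?)(\d+)", name)
        if m is None:
            continue  # no numeric suffix, nothing to pad
        prefix, digits = m.groups()
        parsed.append((name, prefix, digits))
        if len(digits) > 1 and digits.startswith("0"):
            widest[prefix] = max(widest.get(prefix, 0), len(digits))

    # Names shorter than the padded width would gain leading zeroes
    # if the padding were applied to the whole prefix group.
    return [name for name, prefix, digits in parsed
            if len(digits) < widest.get(prefix, 0)]


print(names_at_risk(["listserv01", "listserv3", "listserv2"]))
print(names_at_risk(["listserv01", "listserv10"]))
```

Names flagged this way could be run in a separate `clush` invocation, or covered by a DNS alias as in the workaround noted earlier.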

OleHolmNielsen, Jun 29 '20 19:06

I think your ticket is a duplicate of #293.

degremont, Jun 30 '20 08:06

Thanks! This is a pretty surprising bug.

OleHolmNielsen, Jun 30 '20 08:06

I've added references to this issue in my Slurm Wiki page: https://wiki.fysik.dtu.dk/niflheim/SLURM#clustershell

OleHolmNielsen, Jun 30 '20 08:06

Fixed in https://github.com/cea-hpc/clustershell/commit/5a41bc09f70309600c1a407d2bb3dd08f5d1ba65

thiell, Nov 20 '22 17:11