clustershell
clush doesn't see groups.d definitions?
I installed clush via conda in an environment. All config files are under <conda env location>/clush/etc/clustershell.
In my groups.conf I have
[Main]
default: roles
confdir: /etc/clustershell/groups.conf.d $CFGDIR/groups.conf.d
autodir: /etc/clustershell/groups.d $CFGDIR/groups.d
Next I have groups.d/cluster.yaml where I have my node definitions.
The syntax of the file is fine because running nodeset -LL shows all my definitions.
If I run
clush -a date
I expect to get the date from all machines.
Instead I get
Usage: clush [options] command
clush: error: No node to run on.
so what am I missing here?
Thanks for reporting that issue.
Could you run clush with the -d and -v options to collect more debugging logs?
Also, nodeset -f -a should report the node list that clush uses with -a. Is that the node list you expect?
Aurélien
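The check suggested above can also be scripted. This is a minimal sketch, assuming nodeset is on the PATH of the active conda environment; the helper name nodes_for_all and its injectable run parameter are hypothetical and only there for illustration. It relies solely on the nodeset -f -a invocation mentioned in this thread.

```python
import subprocess


def nodes_for_all(run=subprocess.run):
    """Return what 'nodeset -f -a' prints, stripped ('' if it prints nothing).

    'run' is injectable (hypothetical convenience) so the helper can be
    exercised without a ClusterShell installation.
    """
    result = run(
        ["nodeset", "-f", "-a"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()
```

An empty string here corresponds to the situation in this report: nodeset resolves no nodes for -a, so clush -a has nothing to run on.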
(In reply to the report posted by Stefan Weber on 2024-01-30 11:33, issue https://github.com/cea-hpc/clustershell/issues/552.)
Thanks for your immediate reply.
Here is the output:
$ clush -d -v -a date
DEBUG:root:clush: STARTING DEBUG
Adding nodes from option -a:
Usage: clush [options] command
clush: error: No node to run on.
and
$ nodeset -f -a
Obviously that is not what I would expect :)
If nodeset does not return what you expect, the node group definition is incorrect.
Could you share the output of nodeset -LL and the content of cluster.yaml so we can debug? :)
What is your 'all' definition?
Well, I have no explicit all definition.
To my understanding, all should be everything in roles, since I have default: roles in groups.conf.
Ok, here is a simplified example:
My cluster.yaml
roles:
    dev: '@dev:all'
dev:
    dev: 'dev01'
with
$ nodeset -LL
@dev
@dev:dev dev01
The special name for the "all nodes" feature is not 'all' but '*'.
You should use:
dev: '@dev:*'
See https://clustershell.readthedocs.io/en/latest/config.html#groups-config
and https://clustershell.readthedocs.io/en/latest/tools/nodeset.html#node-groups
The line you suggested came through garbled on my side (dev: ***@***.***:*'), so I could not read it.
But it now works with the following.
My cluster.yaml should actually look like
roles:
    dev: '@dev:*'
dev:
    dev: 'dev01'
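To illustrate why '@dev:all' resolved to nothing while '@dev:*' works, here is a toy model in Python. It is not ClusterShell's resolver: the resolve function, the dictionary layout, and the treatment of -a as '@*' on the default source are assumptions made for illustration, and node ranges such as dev[01-03] are not expanded.

```python
# Toy model only: NOT ClusterShell's implementation, just an illustration
# of the naming rule discussed in this thread.

def resolve(pattern, sources, default):
    """Expand a comma-separated pattern into a set of node names."""
    nodes = set()
    for item in pattern.split(","):
        item = item.strip()
        if not item:
            continue
        if not item.startswith("@"):
            nodes.add(item)  # plain node name, kept as-is
            continue
        # '@source:group', or '@group' which falls back to the default source
        source, _, group = item[1:].rpartition(":")
        source = source or default
        groups = sources.get(source, {})
        if group == "*":
            # '*' means "every node of this source": union of all its groups
            for definition in groups.values():
                nodes |= resolve(definition, sources, default)
        elif group in groups:
            nodes |= resolve(groups[group], sources, default)
        # An unknown group name (such as 'all' below) contributes nothing.
    return nodes


# The two cluster.yaml variants from this thread, as plain dictionaries.
broken = {"roles": {"dev": "@dev:all"}, "dev": {"dev": "dev01"}}
fixed = {"roles": {"dev": "@dev:*"}, "dev": {"dev": "dev01"}}

# '-a' is modelled here as '@*' on the default source 'roles' (assumption).
print(sorted(resolve("@*", broken, "roles")))  # []
print(sorted(resolve("@*", fixed, "roles")))   # ['dev01']
```

In the first variant the dev source has no group literally named all, so the roles group dev ends up empty, which matches the empty @dev line in the nodeset -LL output above.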
OK, the reference was (at least for me) a bit hard to find.
https://clustershell.readthedocs.io/en/latest/tools/nodeset.html#working-with-range-sets mentions that all is an external call. From there, one has to go to https://clustershell.readthedocs.io/en/latest/config.html#group-source-upcalls and read the description of all under External calls.
I think it would help to have a note about this directly under https://clustershell.readthedocs.io/en/latest/config.html#yaml-group-files.