clustershell
clustershell copied to clipboard
Add --nodown Option to Exclude Nodes That Are Down
Add in an option that is similar to the -v option in pdsh. The idea is to skip all hosts that are currently down for one reason or another. The groups.conf file would have something like this in it:
[genders]
map: nodeattr -n $GROUP
all: nodeattr -n -A
list: nodeattr -l
down: whatsup -n -d || /bin/true
Example usage without -D:
> clush -a -b /path/to/command.sh
host8: mcmd: connect failed: No route to host
host12: mcmd: connect failed: No route to host
clush: host[8,12] (2): exited with exit code 1
---------------
host[1-7,9-11] (10)
---------------
Hello World
Example usage with -D:
> clush -D -a -b /path/to/command.sh
---------------
host[1-7,9-11] (10)
---------------
Hello World
Looks like I broke a test. Will fix. Thanks.
Hello. Would like to start a conversation about this pull request. Would like to give clush the ability as described here to skip nodes that are "down." On large clusters, there may be several nodes that are down regularly and would prefer not to send commands to them at all. This pull request is a start at that. Would love your feedback! Thanks!
Thanks for this PR. Few questions to understand this better
- is
whatsup
an independent tool or is it related to genders? Any pointer? - how would you apply that to Slurm group source by example?
- Do you have use cases where you want to run your command on all hosts (even those which you know are down)?
My experience running supercomputer indicates that the source giving the group of nodes is often a totally different one to the one giving the status of nodes. ie:
roles:
compute: node[1-3]
login: node[4-5]
[states]
map: nodehealth -f $GROUP
where you usually combine them with: clush -w @roles:compute&@states:up
or clush -w @roles:compute&@states:down
. That's why I'm wondering if having such default rules, per source is the right one.
There are only two hard things in Computer Science: cache invalidation and naming things.
I'm looking toward naming the option --up
. I prefer positive flags instead of double negation ("no" "down")
Thanks for this PR. Few questions to understand this better
- is
whatsup
an independent tool or is it related to genders? Any pointer?
whatsup is a tool that calculates what hosts are up/down in a cluster https://github.com/chaos/whatsup
- how would you apply that to Slurm group source by example?
Not sure there is a need. The "down" is optional. I am not a slurm expert, but you might do it like:
down: sinfo -h --dead -o "%N"
The note thing is that each site can define "down" for themselves.
- Do you have use cases where you want to run your command on all hosts (even those which you know are down)?
I think there may be times where we want to run the command everywhere and visibly see that we got an error connecting to that host. Can't think of a great reason at the moment, but probably just tired.
My experience running supercomputer indicates that the source giving the group of nodes is often a totally different one to the one giving the status of nodes. ie:
roles: compute: node[1-3] login: node[4-5]
[states] map: nodehealth -f $GROUP
where you usually combine them with:
clush -w @roles:compute&@states:up
orclush -w @roles:compute&@states:down
. That's why I'm wondering if having such default rules, per source is the right one.There are only two hard things in Computer Science: cache invalidation and naming things.
I'm looking toward naming the option
--up
. I prefer positive flags instead of double negation ("no" "down")
Totally fine with changing the name/polarity of the option if you prefer. I was just mapping this to the -v option in pdsh (and was trying to give it a similar feel). From pdsh manpage:
nodeupdown module options
-v Eliminate target nodes that are considered "down" by libnodeupdown.
I feel like we should go toward something like
- Having a proper source for up/down (or other states)
[whatsup]
list: echo -e "up\ndown"
map: if [ $GROUP == "up" ]; then whatsup --up; elif [ $GROUP == "down" ]; then whatsup --down; fi; true
Then find a proper way to declare the upcall, maybe in clush.conf
. Easier to do, but not available to library? Like:
downnodes: @whatsup:down
And add to clush
--up: excludes known "down" nodes using `downnodes` callback.
I was just mapping this to the -v option in pdsh (and was trying to give it a similar feel). From pdsh manpage:
In my thinking, running on up
nodes was the same than skipping down nodes. So the same feel than what you're doing with pdsh, just using --up
instead of -v
. We can think of a short option though...
To be discussed...
Thanks @hawartens for the info! And I like @degremont's idea very much. It would only require modification of clush, and we don't need to add a new group upcall. What do you think @hawartens, would that fit your needs?
A couple of points:
- I'm not fan of limiting this to clush. Sometimes you need nodeset/cluset to "delegate" the work (think milkcheck or shine; but I'm sure other sites have shell scripts that do similar works?) ; so this should be available to other tools and having this in nodeset makes more sense
- I think it's actually possible already with the source infrastructure we have, e.g. it isn't hard to make a "filter" source that just intersects or substract a noteset from whatever was given so e.g.
@up:foo
would take@foo
and filter only up nodes (or@up:source:group
if you use sources, nodeset cuts on the first colon so it would work appropriately I tested that quickly)
With that in mind I don't think adding a new option to clush would be worth it, I'm not fan of having too many options. A source has the drawback of being heavier syntax if you're going to list many nodeset or nodes but there might be some syntax trick to make @up:node[0-1000] or @up:group1,group2 to work, like escaping the comma? would need to play a bit more with that but it doesn't look impossible to me.
We definitely should ship an example of such a source if you come up with one, though, so others can also benefit more readily.
Thanks @hawartens for the info! And I like @degremont's idea very much. It would only require modification of clush, and we don't need to add a new group upcall. What do you think @hawartens, would that fit your needs?
@thiell I am good with any idea we have here that ends up making it very simple to avoid nodes that are down as the pdsh -v
option does. @degremont's idea sounds good to me and seems like it would fit my needs.
A couple of points:
- I'm not fan of limiting this to clush. Sometimes you need nodeset/cluset to "delegate" the work (think milkcheck or shine; but I'm sure other sites have shell scripts that do similar works?) ; so this should be available to other tools and having this in nodeset makes more sense
- I think it's actually possible already with the source infrastructure we have, e.g. it isn't hard to make a "filter" source that just intersects or substract a noteset from whatever was given so e.g.
@up:foo
would take@foo
and filter only up nodes (or@up:source:group
if you use sources, nodeset cuts on the first colon so it would work appropriately I tested that quickly)With that in mind I don't think adding a new option to clush would be worth it, I'm not fan of having too many options. A source has the drawback of being heavier syntax if you're going to list many nodeset or nodes but there might be some syntax trick to make @up:node[0-1000] or @up:group1,group2 to work, like escaping the comma? would need to play a bit more with that but it doesn't look impossible to me.
We definitely should ship an example of such a source if you come up with one, though, so others can also benefit more readily.
Not entirely sure I agree here. I prefer your initial assessment I agree the '&states:up' idiom is a bit cumbersome so having some shortcut would definitely be nice though.
Maybe some day we can add the shortcut and I can owe you a beer (or your preferred beverage)! :)
I think @thiell and I agreed on processing with a new configuration entry in clush.conf
and a command-line option to be added to the CLI for that.
- Add a new entry in main section of
clush.conf
nameddown_nodes
. Add it inClushConfig
, no need for a new property. Add this in documentation. - Add a new
--up
option inOptionParser.install_nodes_options()
. Add this in doc. - In
clush.conf
, the same way you did for nodown, test if the flag exists, and if true, resolve the content of thedown_nodes
and excludes these nodes from the computed list. - Add this in nodeset CLI too
- Add tests for that.
Don't hesitate to force-push your branch after doing these changes
I think @thiell and I agreed on processing with a new configuration entry in
clush.conf
and a command-line option to be added to the CLI for that.
- Add a new entry in main section of
clush.conf
nameddown_nodes
. Add it inClushConfig
, no need for a new property. Add this in documentation.- Add a new
--up
option inOptionParser.install_nodes_options()
. Add this in doc.- In
clush.conf
, the same way you did for nodown, test if the flag exists, and if true, resolve the content of thedown_nodes
and excludes these nodes from the computed list.- Add this in nodeset CLI too
- Add tests for that.
Don't hesitate to force-push your branch after doing these changes
Sorry I have not gotten to this yet, been focused on other things. Will get back to it soon.
- In
clush.conf
, the same way you did for nodown, test if the flag exists, and if true, resolve the content of thedown_nodes
and excludes these nodes from the computed list.
@degremont Finally looking at this again. Just want to make sure I really understand what you are asking for here. It sounds to me like instead of adding the genders
section to groups.conf, you would prefer that we just have a down_nodes
attribute in clush.conf. What do you expect down_nodes
to be set to? Would it still be something like:
down_nodes: whatsup -n -d || /bin/true
And other sites can define down_nodes something else (or unset by default). Not 100% sure, but I believe this also means that I still need to add resolver code for that into NodeSet.py
. Am I understanding this correctly? Just clarifying so I make sure that I understand what you guys are expecting. Thanks!
Actually, the idea is indeed to add a new entry in clush.conf
, but the content will have a nodeset syntax, like:
down_nodes: @whatsup:down
which could be anything else, like:
down_nodes: @slurm:offline
or any nodeset syntax
down_nodes: @slurm:offline,@foo:bar&@prod:all
That way, this will handle the nodeset resolution for you and you will not have nothing to do.
I put an example of what a whatsup
config could be.
You just need in the code something like (kind of):
down_nodes = NodeSet(config.get("down_nodes"))
@degremont Okay. Sounds good. Thanks for the clarification.