clustershell icon indicating copy to clipboard operation
clustershell copied to clipboard

Add --nodown Option to Exclude Nodes That Are Down

Open hawartens opened this issue 4 years ago • 14 comments

Add in an option that is similar to the -v option in pdsh. The idea is to skip all hosts that are currently down for one reason or another. The groups.conf file would have something like this in it:

[genders]
map: nodeattr -n $GROUP
all: nodeattr -n -A
list: nodeattr -l
down: whatsup -n -d || /bin/true

Example usage without -D:

> clush -a -b /path/to/command.sh
host8: mcmd: connect failed: No route to host
host12: mcmd: connect failed: No route to host
clush: host[8,12] (2): exited with exit code 1
---------------
host[1-7,9-11] (10)
---------------
Hello World

Example usage with -D:
> clush -D -a -b /path/to/command.sh
---------------
host[1-7,9-11] (10)
---------------
Hello World

hawartens avatar Sep 16 '20 06:09 hawartens

Looks like I broke a test. Will fix. Thanks.

hawartens avatar Sep 16 '20 13:09 hawartens

Hello. Would like to start a conversation about this pull request. Would like to give clush the ability as described here to skip nodes that are "down." On large clusters, there may be several nodes that are down regularly and would prefer not to send commands to them at all. This pull request is a start at that. Would love your feedback! Thanks!

hawartens avatar Sep 22 '20 10:09 hawartens

Thanks for this PR. Few questions to understand this better

  • is whatsup an independent tool or is it related to genders? Any pointer?
  • how would you apply that to Slurm group source by example?
  • Do you have use cases where you want to run your command on all hosts (even those which you know are down)?

My experience running supercomputer indicates that the source giving the group of nodes is often a totally different one to the one giving the status of nodes. ie:

roles:
    compute: node[1-3]
    login: node[4-5]
[states]
   map: nodehealth -f $GROUP

where you usually combine them with: clush -w @roles:compute&@states:up or clush -w @roles:compute&@states:down. That's why I'm wondering if having such default rules, per source is the right one.

There are only two hard things in Computer Science: cache invalidation and naming things.

I'm looking toward naming the option --up. I prefer positive flags instead of double negation ("no" "down")

degremont avatar Sep 22 '20 10:09 degremont

Thanks for this PR. Few questions to understand this better

  • is whatsup an independent tool or is it related to genders? Any pointer?

whatsup is a tool that calculates what hosts are up/down in a cluster https://github.com/chaos/whatsup

  • how would you apply that to Slurm group source by example?

Not sure there is a need. The "down" is optional. I am not a slurm expert, but you might do it like: down: sinfo -h --dead -o "%N" The note thing is that each site can define "down" for themselves.

  • Do you have use cases where you want to run your command on all hosts (even those which you know are down)?

I think there may be times where we want to run the command everywhere and visibly see that we got an error connecting to that host. Can't think of a great reason at the moment, but probably just tired.

My experience running supercomputer indicates that the source giving the group of nodes is often a totally different one to the one giving the status of nodes. ie:

roles:
    compute: node[1-3]
    login: node[4-5]
[states]
   map: nodehealth -f $GROUP

where you usually combine them with: clush -w @roles:compute&@states:up or clush -w @roles:compute&@states:down. That's why I'm wondering if having such default rules, per source is the right one.

There are only two hard things in Computer Science: cache invalidation and naming things.

I'm looking toward naming the option --up. I prefer positive flags instead of double negation ("no" "down")

Totally fine with changing the name/polarity of the option if you prefer. I was just mapping this to the -v option in pdsh (and was trying to give it a similar feel). From pdsh manpage:

nodeupdown module options
       -v     Eliminate target nodes that are considered "down" by libnodeupdown.

hawartens avatar Sep 22 '20 11:09 hawartens

I feel like we should go toward something like

  • Having a proper source for up/down (or other states)
[whatsup]
list: echo -e "up\ndown"
map: if [ $GROUP == "up" ]; then whatsup --up; elif [ $GROUP == "down" ]; then whatsup --down; fi; true

Then find a proper way to declare the upcall, maybe in clush.conf. Easier to do, but not available to library? Like:

downnodes: @whatsup:down

And add to clush

--up:   excludes known "down" nodes using `downnodes` callback.

I was just mapping this to the -v option in pdsh (and was trying to give it a similar feel). From pdsh manpage:

In my thinking, running on upnodes was the same than skipping down nodes. So the same feel than what you're doing with pdsh, just using --up instead of -v. We can think of a short option though...

To be discussed...

degremont avatar Sep 22 '20 11:09 degremont

Thanks @hawartens for the info! And I like @degremont's idea very much. It would only require modification of clush, and we don't need to add a new group upcall. What do you think @hawartens, would that fit your needs?

thiell avatar Sep 22 '20 17:09 thiell

A couple of points:

  • I'm not fan of limiting this to clush. Sometimes you need nodeset/cluset to "delegate" the work (think milkcheck or shine; but I'm sure other sites have shell scripts that do similar works?) ; so this should be available to other tools and having this in nodeset makes more sense
  • I think it's actually possible already with the source infrastructure we have, e.g. it isn't hard to make a "filter" source that just intersects or substract a noteset from whatever was given so e.g. @up:foo would take @foo and filter only up nodes (or @up:source:group if you use sources, nodeset cuts on the first colon so it would work appropriately I tested that quickly)

With that in mind I don't think adding a new option to clush would be worth it, I'm not fan of having too many options. A source has the drawback of being heavier syntax if you're going to list many nodeset or nodes but there might be some syntax trick to make @up:node[0-1000] or @up:group1,group2 to work, like escaping the comma? would need to play a bit more with that but it doesn't look impossible to me.

We definitely should ship an example of such a source if you come up with one, though, so others can also benefit more readily.

martinetd avatar Sep 22 '20 18:09 martinetd

Thanks @hawartens for the info! And I like @degremont's idea very much. It would only require modification of clush, and we don't need to add a new group upcall. What do you think @hawartens, would that fit your needs?

@thiell I am good with any idea we have here that ends up making it very simple to avoid nodes that are down as the pdsh -v option does. @degremont's idea sounds good to me and seems like it would fit my needs.

hawartens avatar Sep 23 '20 03:09 hawartens

A couple of points:

  • I'm not fan of limiting this to clush. Sometimes you need nodeset/cluset to "delegate" the work (think milkcheck or shine; but I'm sure other sites have shell scripts that do similar works?) ; so this should be available to other tools and having this in nodeset makes more sense
  • I think it's actually possible already with the source infrastructure we have, e.g. it isn't hard to make a "filter" source that just intersects or substract a noteset from whatever was given so e.g. @up:foo would take @foo and filter only up nodes (or @up:source:group if you use sources, nodeset cuts on the first colon so it would work appropriately I tested that quickly)

With that in mind I don't think adding a new option to clush would be worth it, I'm not fan of having too many options. A source has the drawback of being heavier syntax if you're going to list many nodeset or nodes but there might be some syntax trick to make @up:node[0-1000] or @up:group1,group2 to work, like escaping the comma? would need to play a bit more with that but it doesn't look impossible to me.

We definitely should ship an example of such a source if you come up with one, though, so others can also benefit more readily.

Not entirely sure I agree here. I prefer your initial assessment I agree the '&states:up' idiom is a bit cumbersome so having some shortcut would definitely be nice though. Maybe some day we can add the shortcut and I can owe you a beer (or your preferred beverage)! :)

hawartens avatar Sep 23 '20 03:09 hawartens

I think @thiell and I agreed on processing with a new configuration entry in clush.conf and a command-line option to be added to the CLI for that.

  1. Add a new entry in main section of clush.conf named down_nodes. Add it in ClushConfig, no need for a new property. Add this in documentation.
  2. Add a new --up option in OptionParser.install_nodes_options(). Add this in doc.
  3. In clush.conf, the same way you did for nodown, test if the flag exists, and if true, resolve the content of the down_nodes and excludes these nodes from the computed list.
  4. Add this in nodeset CLI too
  5. Add tests for that.

Don't hesitate to force-push your branch after doing these changes

degremont avatar Sep 23 '20 11:09 degremont

I think @thiell and I agreed on processing with a new configuration entry in clush.conf and a command-line option to be added to the CLI for that.

  1. Add a new entry in main section of clush.conf named down_nodes. Add it in ClushConfig, no need for a new property. Add this in documentation.
  2. Add a new --up option in OptionParser.install_nodes_options(). Add this in doc.
  3. In clush.conf, the same way you did for nodown, test if the flag exists, and if true, resolve the content of the down_nodes and excludes these nodes from the computed list.
  4. Add this in nodeset CLI too
  5. Add tests for that.

Don't hesitate to force-push your branch after doing these changes

Sorry I have not gotten to this yet, been focused on other things. Will get back to it soon.

hawartens avatar Oct 03 '20 17:10 hawartens

  1. In clush.conf, the same way you did for nodown, test if the flag exists, and if true, resolve the content of the down_nodes and excludes these nodes from the computed list.

@degremont Finally looking at this again. Just want to make sure I really understand what you are asking for here. It sounds to me like instead of adding the genders section to groups.conf, you would prefer that we just have a down_nodes attribute in clush.conf. What do you expect down_nodes to be set to? Would it still be something like:

down_nodes: whatsup -n -d || /bin/true

And other sites can define down_nodes something else (or unset by default). Not 100% sure, but I believe this also means that I still need to add resolver code for that into NodeSet.py. Am I understanding this correctly? Just clarifying so I make sure that I understand what you guys are expecting. Thanks!

hawartens avatar Oct 14 '20 20:10 hawartens

Actually, the idea is indeed to add a new entry in clush.conf, but the content will have a nodeset syntax, like:

down_nodes: @whatsup:down

which could be anything else, like:

down_nodes: @slurm:offline

or any nodeset syntax

down_nodes: @slurm:offline,@foo:bar&@prod:all

That way, this will handle the nodeset resolution for you and you will not have nothing to do. I put an example of what a whatsup config could be.

You just need in the code something like (kind of):

down_nodes = NodeSet(config.get("down_nodes"))

degremont avatar Oct 15 '20 18:10 degremont

@degremont Okay. Sounds good. Thanks for the clarification.

hawartens avatar Oct 15 '20 20:10 hawartens