clustershell
Add doc for 'remote' and local execution
The clush man page says that using -W exec will result in commands being executed locally.
However, when a topology is enabled, such commands are dispatched to gateways as much as possible (so, not so locally). The remote option controls this behaviour. It should be documented.
Moreover, what should be the appropriate default value for this option?
Consider these 2 examples, used without topology.conf:
- mkdir /tmp/%h
- ipmitool -I lan -H %h power status
They run fine.
If I now enable a topology.conf with the proper configuration, so that gateways are used to reach the corresponding nodes, these 2 commands will be run on the gateways instead of on the current node.
This is not a problem for cmd #2; it is rather expected. But it will be a problem for #1. If we change the default value, it will be the opposite. What should be the correct default value in your opinion?
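To make the difference between the two examples concrete, here is a minimal sketch in plain Python. It does not use ClusterShell at all: the gateway map, the host names and the `plan` helper are hypothetical, and the only behaviour modelled is the one described above (the exec command is expanded per target, and runs on the gateway once a topology is enabled).

```python
# Hypothetical topology: one gateway (gw1) in front of two nodes.
GATEWAY_OF = {"node1": "gw1", "node2": "gw1"}


def expand(command, node):
    """Replace the %h placeholder with the target node name."""
    return command.replace("%h", node)


def plan(command, nodes, tree_enabled, local_host="admin"):
    """Return (host running the command, expanded command) pairs.

    Assumption taken from the discussion: without a topology the exec
    command runs on the local host, with a topology it runs on the
    gateway that serves the target node.
    """
    result = []
    for node in nodes:
        runner = GATEWAY_OF[node] if tree_enabled else local_host
        result.append((runner, expand(command, node)))
    return result


if __name__ == "__main__":
    for tree in (False, True):
        print("tree enabled:", tree)
        for runner, cmd in plan("mkdir /tmp/%h", ["node1", "node2"], tree):
            print("  on %s: %s" % (runner, cmd))
```

With the topology enabled, the directories end up on gw1 rather than on the admin host, which is the surprise for command #1; for the ipmitool command, running from the gateway is usually what is wanted.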
I don't see a problem with your two examples, but I understand the possible issue with case 1. The administrator must know that if he does actually configure topology.conf, "local" commands will be executed on the gateways, so creating a local directory will be done locally on the gateways. But I don't think changing the way remote works is the best strategy. What is missing to (easily) solve case 1, if the user wants to create local directories with target names, is a way to switch off the topology/tree mode. When testing, I'm either moving topology.conf to topology.conf-off, or using --topology=notree.conf with a dummy file, and I'm doing that... very often. I'm quite sure now we should add an option like --notree or at least support --topology=off. That would be great.
What would also be useful in some cases is kind of the reverse: having a topology.conf ready, but by default clush wouldn't use it unless you specify --tree. That would be very user-friendly IMHO. Maybe tree: on/off in clush.conf or defaults.conf?
Actually, there is already an option for that: --remote=yes|no
My question is: what should be the default value for this option with clush?
Imagine the following scenario:
- Admin1 creates a script with clush ... -R exec mkdir /tmp/%h
- Admin2 creates a script with clush ... -R exec ipmitool -I lan -H %h power status
Then, Admin3 deploys a gateway node, relocates all nodes behind this gateway and configures clustershell accordingly. Now, depending on the default value of --remote, one of these scripts will be broken.
With the current default value, it will be Admin1's.
According to the principle of least surprise, clush should do what the user expects when running very usual commands like clush -w foo[1-5] uptime, for example.
After looking more closely at the code, --remote is also available for regular usage (with --worker=ssh). When enabling a topology, as a user, I expect that commands will be automatically routed to gateways without further options... meaning the --remote default value should be set to true.
Does this mean remote should be true by default for distant commands and false for local ones?
Hmm, ok I understand better, and I think you're right. I just tested --worker exec and it doesn't work by default, I need to add --remote=no (my topology.conf is using 2 chained gateways). That may be something to patch in 1.7.1. I will do more tests tonight.
Still, a way to easily control how tree mode is enabled is something I would like to see, but that can wait for 1.8+
btw there is no -W short option for --worker, it's -R. Let me know if you want to add it as an alias.
Hmm, not sure I understand better. I will have to do some tests. I'm pretty sure --remote=yes/no was designed for distant workers only, and remote=no switches to a local worker like ExecWorker on the gateways.
So what does --worker exec mean in tree mode? Should we still use ssh to connect to gateways and use ExecWorker, so that's like --remote=no only? Not sure. It sounds to me that tree mode should be disabled in that case. So installing a new topology.conf won't change your scripts.
However, if --worker some_distant_worker is used, this could be the worker used for remote connections, and it could replace ssh. Probably.
One more word, for me, Admin2 should use:
clush ... --remote=no ipmitool -I lan -H %h power status
(installing a topology.conf will launch this command on gateways)
instead of:
clush ... -R exec ipmitool -I lan -H %h power status
(installing a topology.conf won't launch this command on gateways, but this needs a patch I guess ;)
What about doing this:
- -R <worker> specifies the worker; it can be either exec, ssh, tree or auto (default)
- --remote yes|no only works with auto or tree and is yes by default
Behaviors:
- -R exec disables tree mode
- -R ssh disables tree mode
- -R auto (tree mode if topology.conf found - default)
- -R tree enables tree mode (enforced)
With auto or tree workers, --remote can be used with the following behaviors:
- --remote=yes commands are executed on target nodes if tree mode is enabled
- --remote=no commands are executed on gateways if tree mode is enabled
- --remote=yes uses default distant worker if tree mode is disabled
- --remote=no uses default local worker if tree mode is disabled
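The proposal above can be written down as a small decision table. This is only a sketch of the proposed semantics in plain Python, not existing clush code: the `resolve` function and its return strings are hypothetical, and treating -R tree as tree mode even when no topology.conf is found is my reading of "enforced".

```python
def resolve(worker="auto", remote=True, topology_found=False):
    """Return (tree_mode, where_commands_run) for the proposed options.

    worker         -- value of -R/--worker: exec, ssh, tree or auto
    remote         -- value of --remote (yes -> True, no -> False)
    topology_found -- whether a topology.conf was found
    """
    if worker in ("exec", "ssh"):
        # Explicit exec/ssh workers disable tree mode; --remote is ignored.
        return (False, "%s worker" % worker)
    if worker not in ("auto", "tree"):
        raise ValueError("unknown worker: %r" % worker)

    # auto: tree mode only if a topology is found; tree: always (assumption).
    tree_mode = (worker == "tree") or topology_found
    if tree_mode:
        return (True, "target nodes" if remote else "gateways")
    return (False, "default distant worker" if remote else "default local worker")


if __name__ == "__main__":
    for worker in ("exec", "ssh", "auto", "tree"):
        for remote in (True, False):
            for topo in (False, True):
                print(worker, remote, topo, "->", resolve(worker, remote, topo))
```

Under this table, Admin1's -R exec script keeps running locally after a topology.conf is installed, and Admin2 would switch to --remote=no to have the command run on the gateways.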
But there will be no way to change the local / distant worker with tree mode on the command line for now.
But if the following is found in defaults.conf:
[task.default]
distant_workername: rsh
that will switch to rsh worker in tree mode.
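As an illustration of that setting, here is a small sketch that reads the fragment with Python's standard configparser. It is not how ClusterShell itself loads defaults.conf; the `distant_workername` helper is hypothetical, and the ssh fallback is an assumption about the default.

```python
import configparser

# The defaults.conf fragment quoted above.
DEFAULTS_CONF = """\
[task.default]
distant_workername: rsh
"""


def distant_workername(text, fallback="ssh"):
    """Return the distant worker name from a defaults.conf-style text.

    The fallback value (ssh) is an assumption for this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("task.default", "distant_workername", fallback=fallback)


if __name__ == "__main__":
    print(distant_workername(DEFAULTS_CONF))
    print(distant_workername(""))
```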
updated comments
Do not rush on a patch for that. I need to think about it and all of this is not clear to me yet. Need to look at the code and do more tests...
Planned for 1.7.2, no problem. It's already usable as is. :)
I don't want to do major changes in 1.7.2, just bug fixes or doc improvements. @degremont: is it ok to change the milestone to 1.8? My change proposal above is still valid btw.
Getting back on this ticket, your summary looks good.
I was wondering about a different approach. Should we modify the distant_worker with the -R|--worker option?
There is no way to control what is launched on gateways without modifying defaults.conf on these nodes.
If -R means: change distant worker (meaning, change Task default distant_workername, and this is propagated), this will be applied on gateways. Meaning we can launch Exec worker there. This could be useful for other kinds of workers, like Rsh, or WhateverSh.