
environ of upcall commands

Open btwe opened this issue 1 year ago • 7 comments

My aim is to configure a group source that pulls its information from an Ansible inventory configuration. The benefit would be that I only have to manage the host inventory and groups in Ansible, and clustershell can consume them as well.

According to the docs it should be possible to define a groupsource like:

#~/.config/clustershell/groups.conf
[myinv]
map=ansible-inventory -i myinv --graph $GROUP | perl -ne 'if (m/\|--([^@].*)/){print "$1 "}'
all=ansible-inventory -i myinv --graph all | perl -ne 'if (m/\|--([^@].*)/){print "$1 "}'
list=ansible-inventory -i myinv --graph all | perl -ne 'if (m/\|--@(.*):/){print "$1 "}'

This requires that the upcall commands are executed in the same environment as the CLI command. But clustershell executes those commands via subprocess.Popen, where it sets cwd=self.cfgdir. The relative inventory path myinv is then resolved against the configuration directory instead of my working directory, so ansible-inventory cannot find the inventory dir and fails.
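The effect of the cwd argument can be reproduced outside clustershell. The following is a minimal sketch using only the standard library; the helper name child_cwd and the temporary directory standing in for self.cfgdir are assumptions made for illustration, not clustershell code.

```python
import os
import subprocess
import sys
import tempfile


def child_cwd(cwd=None):
    """Return the working directory as seen by a child process.

    cwd=None means the child inherits the parent's working directory,
    which is what the patched Popen call would do.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        cwd=cwd,
        check=True,
    )
    # realpath normalizes symlinked temp directories for comparison.
    return os.path.realpath(result.stdout.strip())


# A temporary directory stands in for self.cfgdir (assumption).
with tempfile.TemporaryDirectory() as fake_cfgdir:
    print("inherited cwd:", child_cwd())
    print("forced cwd   :", child_cwd(cwd=fake_cfgdir))
```

Any relative path in the child's command line, such as an inventory name passed with -i, is resolved against whichever of the two directories applies.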

The following patch works for me, but I do not know how far this reaches or how it affects other use cases. What do you think?

diff --git a/lib/ClusterShell/NodeUtils.py b/lib/ClusterShell/NodeUtils.py
index f4a52f6..8fcafcd 100644
--- a/lib/ClusterShell/NodeUtils.py
+++ b/lib/ClusterShell/NodeUtils.py
@@ -202,7 +202,7 @@ class UpcallGroupSource(GroupSource):
         """
         cmdline = Template(self.upcalls[cmdtpl]).safe_substitute(args)
         self.logger.debug("EXEC '%s'", cmdline)
-        proc = Popen(cmdline, stdout=PIPE, shell=True, cwd=self.cfgdir,
+        proc = Popen(cmdline, stdout=PIPE, shell=True,
                      universal_newlines=True)
         output = proc.communicate()[0].strip()
         self.logger.debug("READ '%s'", output)

Using nodeset or cluset is then painfully slow, because those CLI commands treat all names as group names and try to resolve them in a loop, where each call to ansible-inventory is quite slow. But clush -bg GRPNAME works reasonably well, because there seems to be only one resolve call.

btwe, Dec 06 '23 11:12

[disclaimer] I'm the main developer of Cumin, sorry for the intrusion, I hope I'm not crossing a line here [/disclaimer]

@btwe another possibility (that requires some work though) could be to use wikimedia/cumin, which is built on top of ClusterShell for the remote execution part but allows querying different backends for host selection, including custom ones. For your specific use case, though, you'll need to write your own custom backend, as there isn't one for Ansible.

volans-, Dec 06 '23 12:12

Hi @btwe

I don't think we can change this default directory which is set on purpose, see https://github.com/cea-hpc/clustershell/blob/master/doc/man/man5/groups.conf.5#L140-L142

I'm not sure I understand the performance issue you are talking about. Could you give me an example of which command is slow and how long it takes? clush and nodeset use the same code.

degremont, Dec 07 '23 08:12

Also, where is the myinv inventory file located? In which directory?

degremont, Dec 07 '23 08:12

Hi @degremont

many thanks for your reply.

> I don't think we can change this default directory which is set on purpose, see https://github.com/cea-hpc/clustershell/blob/master/doc/man/man5/groups.conf.5#L140-L142

Yes, I assumed this would break other setups. But I think I can work around this. It seems to be possible to place some scripts in CFGDIR which can do all the magic I need for the upcalls.
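Such a helper script could do the parsing that the perl one-liners in the original configuration perform. The following is a rough sketch, not a tested integration: the function names, the SAMPLE_GRAPH text, and the idea of passing the inventory location as an absolute path are assumptions. The two regular expressions are the ones from the map/all and list upcalls above, and the only ansible-inventory options used are -i and --graph as they appear in this issue.

```python
import re
import subprocess

# Same patterns as the perl one-liners in the groups.conf above.
HOST_RE = re.compile(r"\|--([^@].*)")   # leaf entries: hosts
GROUP_RE = re.compile(r"\|--@(.*):")    # entries starting with '@': groups

# Illustrative sample of --graph output (assumed shape, invented names).
SAMPLE_GRAPH = """\
@all:
  |--@ungrouped:
  |--@web:
  |  |--web1
  |  |--web2
  |--@db:
  |  |--db1
"""


def hosts_from_graph(text):
    """Return the host names found in --graph output, in order."""
    matches = (HOST_RE.search(line) for line in text.splitlines())
    return [m.group(1).strip() for m in matches if m]


def groups_from_graph(text):
    """Return the child group names found in --graph output, in order."""
    matches = (GROUP_RE.search(line) for line in text.splitlines())
    return [m.group(1).strip() for m in matches if m]


def run_graph(inventory_path, group="all"):
    """Call ansible-inventory and return its raw --graph output.

    inventory_path should be absolute, because the upcall runs with
    the configuration directory as its working directory.
    """
    result = subprocess.run(
        ["ansible-inventory", "-i", inventory_path, "--graph", group],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Parse the sample only; run_graph() needs a real inventory.
    print(" ".join(hosts_from_graph(SAMPLE_GRAPH)))
    print(" ".join(groups_from_graph(SAMPLE_GRAPH)))
```

A script built this way would print a space-separated list, matching what the perl commands emit, and the map, all and list upcalls would each call it with the appropriate arguments.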

> I'm not sure I understand the performance issue you are talking about. Could you give me an example of which command is slow and how long it takes? clush and nodeset use the same code.

The performance issue is not in clustershell. The call to ansible-inventory is very time-consuming, with a wall time of ~1s. nodeset -l GROUP first executes the upcall list and then for each item of the result it executes the upcall map. It works as designed, but the iteration count of the loop is about 20. OTOH, clush -g GROUP executes only the upcall map once to get the related nodeset.

> Also, where is the myinv inventory file located? In which directory?

This would be available in the cwd I am currently working in.

I think this issue can then be closed.

@volans- thanks for pointing out cumin.

btwe, Dec 07 '23 09:12

> nodeset -l GROUP first executes the upcall list and then for each item of the result it executes the upcall map

This is not the right way to do this. What you seem to want is:

nodeset -f @GROUP which will be as fast as clush -g GROUP (which is the same as clush -w @GROUP )

degremont, Dec 07 '23 10:12

> nodeset -f @GROUP which will be as fast as clush -g GROUP (which is the same as clush -w @GROUP )

Perfect, -f,--fold or -e,--expand also only resolve the requested group. Many thanks!
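The difference in upcall counts described in this thread can be illustrated with a toy model. This is only a sketch of the call pattern as reported above (one list upcall followed by one map upcall per listed group, versus a single map upcall); the class CountingSource and the group names are invented for the example and are not ClusterShell code.

```python
class CountingSource:
    """Toy group source that counts how many upcalls were made."""

    def __init__(self, groups, cost_s=1.0):
        self.groups = groups      # group name -> list of node names
        self.cost_s = cost_s      # assumed wall time per upcall, in seconds
        self.calls = 0

    def upcall_list(self):
        self.calls += 1
        return sorted(self.groups)

    def upcall_map(self, group):
        self.calls += 1
        return self.groups[group]

    def estimated_seconds(self):
        return self.calls * self.cost_s


def list_then_map(source):
    """Pattern reported for listing: list once, then map every group."""
    return {g: source.upcall_map(g) for g in source.upcall_list()}


def map_only(source, group):
    """Pattern reported for resolving @GROUP directly: a single map."""
    return source.upcall_map(group)


# About 20 groups at ~1s per upcall, as reported in the thread.
groups = {"grp%02d" % i: ["node%02d" % i] for i in range(20)}

listing = CountingSource(groups)
list_then_map(listing)
print("list + map:", listing.calls, "upcalls,", listing.estimated_seconds(), "s")

single = CountingSource(groups)
map_only(single, "grp03")
print("map only  :", single.calls, "upcall,", single.estimated_seconds(), "s")
```

With a slow upcall command, the per-call cost dominates, so the number of upcalls is what decides how long a command takes.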

btwe, Dec 07 '23 10:12

The situation is clear to me. Thanks for all the support. From my point of view, you can close this.

btwe, Apr 10 '24 10:04