wlm-operator

Use slurm C language API instead of calling binaries

Open pisarukv opened this issue 5 years ago • 5 comments

First of all, we will need to create Go bindings for the Slurm C library.

pisarukv avatar May 02 '19 12:05 pisarukv

I was able to generate Go bindings (see sashayakovtseva/go-slurm), but the returned info is always empty. Attempts to disable munge auth and to run the C code directly did not help. Switching tasks.

sashayakovtseva avatar May 20 '19 10:05 sashayakovtseva

@sashayakovtseva @cali4888 this may seem like a roundabout solution, but would it be unreasonable to provide a FUSE plugin that polls Slurm on demand, like a web service, and exposes the Slurm data through a file interface?

For clarity: an ls operation on the 'slurm machine directory' kicks off a query through the Slurm C library for the number of nodes, and each node gets a file in the directory. A cat or file read on a machine file prints out the Slurm data for that node. This would be similar to how Linux exposes data about a system in /proc or /sys.

ct-clmsn avatar Jun 20 '19 01:06 ct-clmsn

I am not sure that I fully understand all the benefits of such a plugin. Can you give us a few more details on why it would be useful in the operator?

pisarukv avatar Jun 20 '19 13:06 pisarukv

@cali4888 apologies for the delay in responding. cgo code might get a little ugly. Having a /proc- or /sys-style directory that updates dynamically with Slurm information would mean that queries to Slurm from Go would be plain file reads (and the files could be JSON or YAML formatted). The FUSE plugin would be written in C, and your Slurm client code would reside in that plugin.
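As a rough sketch of the JSON variant: if each node file held a JSON object, reading it from Go would be a file read plus `json.Unmarshal`. The `NodeInfo` struct and its field names (`name`, `state`, `cpus`) are purely illustrative assumptions; the real schema would be whatever the plugin chose to emit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// NodeInfo is a hypothetical schema for one node file. The fields are
// placeholders and do not mirror any actual Slurm structure.
type NodeInfo struct {
	Name  string `json:"name"`
	State string `json:"state"`
	CPUs  int    `json:"cpus"`
}

// parseNode decodes the JSON contents of a node file.
func parseNode(data []byte) (NodeInfo, error) {
	var n NodeInfo
	if err := json.Unmarshal(data, &n); err != nil {
		return NodeInfo{}, err
	}
	return n, nil
}

// loadNode reads a node file from disk and decodes it.
func loadNode(path string) (NodeInfo, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return NodeInfo{}, err
	}
	return parseNode(data)
}

func main() {
	// The path to a node file is an assumption supplied by the caller.
	if len(os.Args) < 2 {
		fmt.Println("usage: nodeinfo <node-file>")
		return
	}
	node, err := loadNode(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "loading node:", err)
		return
	}
	fmt.Printf("%s is %s with %d CPUs\n", node.Name, node.State, node.CPUs)
}
```

No cgo is involved on this side, so the Go runtime never has to manage memory allocated by the Slurm library.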

Overhead probably isn't something y'all are concerned with, but the memory management aspect of cgo integration might be worth considering: https://www.cockroachlabs.com/blog/the-cost-and-complexity-of-cgo/

ct-clmsn avatar Jul 02 '19 15:07 ct-clmsn

@ct-clmsn thanks for the clarification :) Yeah, I agree with you; I'm also not a big fan of cgo. Such a plugin could be a good solution here. Right now we are thinking about how to make the operator more generic, and we are actively looking into PMIx as a possible solution. That's why we are not rushing to migrate away from direct calls to binaries.

pisarukv avatar Jul 03 '19 10:07 pisarukv