Mark Grondona
Possibly related? #3927
Thanks @jameshcorbett! One idea proposed by @garlick was to have the prolog exit with a special nonzero exit code. The perilog.so plugin could then collect the ranks on which this...
D'oh, I misunderstood the use case, sorry! :facepalm: This will be slightly less complicated to support -- the determination of which nodes "failed" could be contained in the coral2 jobtap...
While we determine a way to do this generally in Flux, perhaps a kind of proof of concept or stopgap solution fully implemented in flux-coral2 could work for https://github.com/flux-framework/flux-coral2/issues/364: 1....
Good guess, I also bet on an hwloc issue. I wonder if putting the new `LD_LIBRARY_PATH` on the `flux alloc` command line would resolve the issue for now: ``` $...
Actually, scratch that. It will probably have the same problem since the environment will be set by the job shell before the brokers are invoked. Edit: And just FYI, I...
> I moved torch's libnuma .so out of the way and flux is fine. Nice catch! libnuma is part of numactl which is probably one of hwloc's dependencies (unless it...
Yes, that is correct: `LD_LIBRARY_PATH` is ignored for setuid/setgid binaries. From [ld.so(8)](https://linux.die.net/man/8/ld.so) > Using the environment variable LD_LIBRARY_PATH. Except if the executable is a set-user-ID/set-group-ID binary, in which case...
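To check whether a given binary falls under the ld.so(8) rule quoted above, one option is to inspect its mode bits. The following is a minimal sketch, not part of Flux: the helper name `ignores_ld_library_path` is made up for illustration, and it only looks at the set-user-ID/set-group-ID bits. It does not detect other triggers of the dynamic linker's secure-execution mode, such as file capabilities.

```python
import os
import stat


def ignores_ld_library_path(path: str) -> bool:
    """Return True if `path` has the set-user-ID or set-group-ID bit set.

    Hypothetical helper for illustration: for such binaries the dynamic
    linker ignores LD_LIBRARY_PATH, per ld.so(8).
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_ISUID | stat.S_ISGID))


if __name__ == "__main__":
    import sys

    # Usage: python check_setuid.py /path/to/binary [...]
    for binary in sys.argv[1:]:
        verdict = "ignored" if ignores_ld_library_path(binary) else "honored"
        print(f"{binary}: LD_LIBRARY_PATH is {verdict} (by mode bits)")
```

Running it against the path of the binary in question shows whether setting `LD_LIBRARY_PATH` in the environment can be expected to have any effect on it.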
Ah, I see. I misunderstood the intent of that comment, sorry!
> However I don't know how to get it to ignore the parse_jobspec: job ƒ2kTi2FyZ invalid jobspec; Unsupported resource type 'ssd' Is this coming from the `job-list` module? If so...