flux-core
System-default Flux does not bootstrap with `jsrun`
Older versions of Flux work properly though. See below.
[corbett8@lassen27:~]$ lrun -n 2 flux start echo foobar
foobar
foobar
[corbett8@lassen27:~]$ lrun -n 2 /usr/global/tools/flux/blueos_3_ppc64le_ib/flux-c0.29.0-s0.18.0/bin/flux start echo foobar
foobar
[corbett8@lassen27:~]$ FLUX_PMI_DEBUG=1 lrun -n 2 flux start -vvv echo foobar
flux-start: /usr/libexec/flux/cmd/flux-broker echo foobar
flux-start: /usr/libexec/flux/cmd/flux-broker echo foobar
pmi-debug-singleton[-1]: init = operation completed successfully
pmi-debug-singleton[0]: get_params (rank=0 size=1 kvsname=singleton) = operation completed successfully
pmi-debug-singleton[0]: kvs_get (kvsname=singleton key=flux.instance-level value=<none>) = operation failed
pmi-debug-singleton[0]: finalize = operation completed successfully
pmi-debug-singleton[-1]: init = operation completed successfully
pmi-debug-singleton[0]: get_params (rank=0 size=1 kvsname=singleton) = operation completed successfully
pmi-debug-singleton[0]: kvs_get (kvsname=singleton key=flux.instance-level value=<none>) = operation failed
pmi-debug-singleton[0]: finalize = operation completed successfully
foobar
foobar
[corbett8@lassen27:~]$ FLUX_PMI_DEBUG=1 lrun -n 2 /usr/global/tools/flux/blueos_3_ppc64le_ib/flux-c0.29.0-s0.18.0/bin/flux start echo foobar
pmi-debug-pmix[-1]: init = operation completed successfully
pmi-debug-pmix[1]: get_params (rank=1 size=2 kvsname=11) = operation completed successfully
pmi-debug-pmix[1]: kvs_get (kvsname=11 key=flux.instance-level value=<none>) = operation failed
pmi-debug-pmix[1]: kvs_get (kvsname=11 key=PMI_process_mapping value=<none>) = operation failed
pmi-debug-pmix[1]: kvs_put (kvsname=11 key=1 value=6R=Bc}>P]7l&t+{^Ggy5wWZzzq&p?-P#O<<J^-G5) = operation completed successfully
pmi-debug-pmix[1]: kvs_commit (kvsname=11) = operation completed successfully
pmi-debug-pmix[-1]: init = operation completed successfully
pmi-debug-pmix[0]: get_params (rank=0 size=2 kvsname=11) = operation completed successfully
pmi-debug-pmix[0]: kvs_get (kvsname=11 key=flux.instance-level value=<none>) = operation failed
pmi-debug-pmix[0]: kvs_get (kvsname=11 key=PMI_process_mapping value=<none>) = operation failed
pmi-debug-pmix[0]: kvs_put (kvsname=11 key=0 value==!1CIlZ)=U1^>pI3ZdBUCvTr8mr7TIe<!>>@xwny,tcp://[::ffff:192.168.128.26]:49152) = operation completed successfully
pmi-debug-pmix[0]: kvs_commit (kvsname=11) = operation completed successfully
pmi-debug-pmix[1]: barrier = operation completed successfully
pmi-debug-pmix[1]: kvs_get (kvsname=11 key=0 value==!1CIlZ)=U1^>pI3ZdBUCvTr8mr7TIe<!>>@xwny,tcp://[::ffff:192.168.128.26]:49152) = operation completed successfully
pmi-debug-pmix[0]: barrier = operation completed successfully
pmi-debug-pmix[0]: kvs_get (kvsname=11 key=1 value=6R=Bc}>P]7l&t+{^Ggy5wWZzzq&p?-P#O<<J^-G5) = operation completed successfully
pmi-debug-pmix[1]: barrier = operation completed successfully
pmi-debug-pmix[0]: barrier = operation completed successfully
pmi-debug-pmix[1]: finalize = operation completed successfully
pmi-debug-pmix[0]: finalize = operation completed successfully
foobar
For completeness, the failing flux is flux-core 0.38.0:
[garlick@lassen709:~]$ which flux
/usr/bin/flux
[garlick@lassen709:~]$ flux version
commands: 0.38.0
libflux-core: 0.38.0
build-options: +hwloc==1.11.0+zmq==4.1.5
Possibly it's because it wasn't configured with --enable-pmix-bootstrap, whereas the older build was:
[garlick@lassen709:flux-core]$ /usr/global/tools/flux/blueos_3_ppc64le_ib/flux-c0.29.0-s0.18.0/bin/flux version
commands: 0.29.0
libflux-core: 0.29.0
build-options: +pmix-bootstrap==3.1.4+hwloc==1.11.6+zmq==4.1.5
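A quick way to check any given build is to look for pmix-bootstrap in the build-options line. A minimal sketch, where the helper name has_pmix_bootstrap is made up for illustration and the only assumption is that the build-options line looks like the two outputs above:

```shell
#!/bin/bash
# Hypothetical helper (not part of flux): reads `flux version` output on
# stdin and succeeds only if the build-options line lists pmix-bootstrap.
has_pmix_bootstrap() {
    # -F treats the pattern as a fixed string, so '+' is matched literally.
    grep '^build-options:' | grep -qF -e '+pmix-bootstrap'
}

# Usage, with whichever flux build you want to inspect:
#   flux version | has_pmix_bootstrap && echo "pmix bootstrap enabled"
```

Run against the two builds above, this should report the 0.29.0 side install as enabled and the system 0.38.0 as not.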
Well, oof, that makes sense. Seems pretty obvious, I just didn't realize that was a configuration option.
An obscure one, to be sure. We'll want to be sure to add that to the rpm spec file, conditional on the architecture.
Just a couple of notes since I've not really been on that system much.
There are three rpm packaged versions of pmix, all side installed under /usr/pmix with no pkgconfig file:
pmix125-1.2.5-9.1.ch6.ppc64le
pmix312-3.1.2-9.1.ch6.ppc64le
pmix214-2.1.4-13.1.ch6.ppc64le
FWIW, here is a script I just used to configure flux-core against pmix312:
#!/bin/bash
OPENPMIX_VERSION=3.1.2
OPENPMIX_PREFIX=/usr/pmix/${OPENPMIX_VERSION}
# Usage make_pc prefix version
# Prints a temp directory containing pmix.pc file
make_pc() {
local dir=$(mktemp -d)
cat >${dir}/pmix.pc <<-EOT
prefix=$1
exec_prefix=\${prefix}
libdir=\${exec_prefix}/lib64
includedir=\${prefix}/include
Name: pmix
Description: Process Management Interface for Exascale (PMIx)
Version: $2
URL: https://pmix.org/
Requires: hwloc libevent zlib
Libs: -L\${libdir} -lpmix
Cflags: -I\${includedir}
EOT
echo $dir
}
# start from pkg-config's built-in search path
PKG_CONFIG_PATH=$(pkg-config --variable pc_path pkg-config)
# add pmix (it ships w/o pmix.pc)
tempdir=$(make_pc "${OPENPMIX_PREFIX}" "${OPENPMIX_VERSION}")
PKG_CONFIG_PATH=${tempdir}:${PKG_CONFIG_PATH}
PATH=${PATH} \
PKG_CONFIG_PATH=${PKG_CONFIG_PATH} \
./configure --enable-pmix-bootstrap
# clean up the temporary pmix.pc
rm "${tempdir}/pmix.pc"
rmdir "${tempdir}"
Now to figure out how to test it. (lrun seems to only work within an LSF allocation?)
Yeah lrun is only within an allocation, and jsrun too (which lrun wraps).
Is this supposed to work? It just hangs for me:
$ lalloc 2
[snip]
Job <3746120> is submitted to default queue <pbatch>.
<<Waiting for dispatch ...>>
<<Starting on lassen710>>
<<Waiting for JSM to become ready ...>>
<<Redirecting to compute node lassen743, setting up as private launch node>>
$ lrun -n2 hostname
Yeah it’s supposed to work, that’s weird…
I wonder if this also explains the problem with -Srundir=/path/to/shared/fs on that system, since each broker thinks it is rank 0, each one tries to create the same socket in rundir. (Also, do users want to use statedir in recent versions of Flux, not rundir?)
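One rough way to confirm that each broker came up as its own size-1 instance is to count the singleton get_params lines in the FLUX_PMI_DEBUG output. This is only a sketch: the helper name count_singleton_brokers is made up, and it assumes the debug lines look like the ones pasted at the top of this issue.

```shell
#!/bin/bash
# Hypothetical helper (not part of flux): reads FLUX_PMI_DEBUG output on
# stdin and prints how many brokers reported a singleton (rank=0 size=1)
# bootstrap.
count_singleton_brokers() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the helper usable under "set -e".
    grep -cF 'get_params (rank=0 size=1 kvsname=singleton)' || true
}

# Usage inside an allocation (assumes the debug lines go to stderr):
#   FLUX_PMI_DEBUG=1 lrun -n 2 flux start echo foobar 2>&1 | count_singleton_brokers
```

A count equal to the number of tasks launched would mean every broker believes it is rank 0 of a one-broker instance.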
FYI - I've built flux-core-0.41.0-2 RPMs for TOSS 3 with --enable-pmix-bootstrap on ppc64le
I wonder if this also explains the problem with -Srundir=/path/to/shared/fs on that system, since each broker thinks it is rank 0, each one tries to create the same socket in rundir.
Yeah, that makes sense.
(Also, do users want to use statedir in recent versions of Flux, not rundir?)
Yep. Just double checked our FAQ and that's what it says.
This is what I've been using to run Flux on Lassen with jsrun:
#!/bin/bash
#BSUB -q pdebug
#BSUB -W 120
#BSUB -nnodes 10
#BSUB -J fluxsesh
module use /usr/tce/modulefiles/Core
module use /usr/global/tools/flux/blueos_3_ppc64le_ib/modulefiles
module load pmi-shim
PMIX_MCA_gds="^ds12,ds21" jsrun -a 1 -c ALL_CPUS -g ALL_GPUS -n 10 --bind=none --smpiargs="-disable_gpu_hooks" flux start sleep inf