flux-sched

`t4004-match-hwloc.t` fails if flux curve keys do not exist

Open SteVwonder opened this issue 5 years ago • 5 comments

❯ make check  
<snip>
PASS: t4003-cancel-info.t 7 - removing resource works
ERROR: t4004-match-hwloc.t - missing test plan
ERROR: t4004-match-hwloc.t - exited with status 137 (terminated by signal 9?)
PASS: t4005-match-unsat.t 1 - loading resource module with a tiny machine config works
<snip>
❯ ./t4004-match-hwloc.t
flux-broker: zsecurity_comms_init: The directory '/home/sherbein/.flux' does not exist.  Have you run "flux keygen"?
flux-broker: overlay_bind failed: No such file or directory
flux-broker: bootstrap failed
flux-broker: zsecurity_comms_init: The directory '/home/sherbein/.flux' does not exist.  Have you run "flux keygen"?
flux-broker: overlay_bind failed: No such file or directory
flux-broker: bootstrap failed
flux-start: 0 (pid 920235) exited with rc=1
flux-start: 1 (pid 920236) exited with rc=1
flux-start: 2 (pid 920237) Killed
flux-start: 3 (pid 920238) Killed

I see three options (please suggest more if you have them):

  1. Have the sharness script in flux-sched check for the keys and if they do not exist auto-generate them:
diff --git a/t/sharness.d/sched-sharness.sh b/t/sharness.d/sched-sharness.sh
index 29ae36f1..2608320e 100644
--- a/t/sharness.d/sched-sharness.sh
+++ b/t/sharness.d/sched-sharness.sh
@@ -21,6 +21,7 @@ fi

 ## Set up environment using flux(1) in PATH
 flux --help >/dev/null 2>&1 || error "Failed to find flux in PATH"
+[[ -f $HOME/.flux/curve/client ]] || flux keygen
 eval $(flux env)
  2. Have the sharness script in flux-sched check for the keys and, if they do not exist, skip tests that require them.
  3. Leave flux-sched as it is and just update our quickstart guide on readthedocs to require running flux keygen before building flux-sched.
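
For the skip-if-missing option, a rough sketch of what the check could look like. `curve_keys_present` is a hypothetical helper name, not something that exists in flux-sched; the path it tests is the same one the diff above checks, and `skip_all` / `test_done` are the usual sharness way to skip a whole test file.

```shell
# Sketch only: curve_keys_present is a hypothetical helper, not part of flux-sched.
# Returns 0 if the curve client key exists under the given key directory.
curve_keys_present() {
    # $1: directory holding the keys; defaults to the per-user location
    keydir="${1:-$HOME/.flux}"
    [ -f "$keydir/curve/client" ]
}

# Intended use near the top of a test script, after sharness is sourced:
#   if ! curve_keys_present; then
#       skip_all='flux curve keys not found, run "flux keygen" first'
#       test_done
#   fi
```

This skips the whole file, so it would only make sense in tests that start more than one broker.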

My preference is 1, but I'm not sure if there are any potential "gotchas" with doing that (@garlick?).

Thoughts?

SteVwonder, May 09 '20 02:05

What does flux-core do in this case? If it handles this sufficiently well, we should use the same trick. Otherwise, this should be fixed in both places.

dongahn, May 15 '20 22:05

I guess this is still an issue. Wouldn't it be better to handle this in sharness.d/flux-sharness.sh so that it is solved for both flux-core and fluxion?

dongahn, Jul 28 '20 23:07

Looks like flux-core runs flux-keygen during make and saves some keys in the source tree for use while testing:

> make -j
<snip>
make[1]: Entering directory '/usr/src/etc'
  GEN      flux/.nodocs
  GEN      flux/curve
  GEN      flux/help.d/core.json
Saving /usr/src/etc/flux/curve/client
Saving /usr/src/etc/flux/curve/client_private
Saving /usr/src/etc/flux/curve/server
Saving /usr/src/etc/flux/curve/server_private

That was in a container, so /usr/src was the git repo/source tree.

I suspect we could do the same in flux-sched: put the equivalent of `flux keygen --secdir=$BUILDDIR/etc/flux/curve` in etc/Makefile.am, and then set FLUX_SEC_DIRECTORY to that build directory in the sharness script. (I'm not sure how flux-core gets away with not setting the environment variable).
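
Roughly what I have in mind for the sharness side, as an untested sketch. `ensure_test_keys` is a hypothetical helper name; it assumes that `flux keygen --secdir=DIR` creates DIR itself and that an existing directory means the keys were already generated. Both assumptions would need checking against flux-core.

```shell
# Sketch only: ensure_test_keys is a hypothetical helper, not existing flux-sched code.
# Generates keys into a build-tree directory once, then points flux at it.
ensure_test_keys() {
    secdir="$1"
    # Assumption: if the directory already exists, keys were generated earlier.
    if [ ! -d "$secdir" ]; then
        # Assumption: flux keygen creates $secdir itself; only the parent is made here.
        mkdir -p "$(dirname "$secdir")" || return 1
        flux keygen --secdir="$secdir" || return 1
    fi
    FLUX_SEC_DIRECTORY="$secdir"
    export FLUX_SEC_DIRECTORY
}
```

It would be called once before the brokers are started, e.g. `ensure_test_keys "$BUILDDIR/etc/flux/curve"`, where `BUILDDIR` stands in for whatever variable the test environment uses for the build tree.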

SteVwonder, Jul 29 '20 05:07

> I'm not sure how flux-core gets away with not setting the environment variable

There are a number of compiled-in paths that are altered if flux detects that it is running inside the flux-core source tree. The key dir is one of those. So we cheat - sorry!

It doesn't help now, but there is flux-framework/flux-core#2767 which would eliminate the need for users to have keys. I think we may want to bump this up in priority for our TOSS 4 deliverable since reading keys out of NFS directories is generally frowned upon in LC.

In the meantime, do those tests really need to start 4 brokers? The resource set is being provided as test input. Keys are only required if broker-to-broker connections need to be established. Edit: so changing test_under_flux 4 to test_under_flux 1 in the two hwloc tests would be another workaround.

garlick, Jul 29 '20 13:07

> So we cheat - sorry!

😆

> It doesn't help now, but there is flux-framework/flux-core#2767 which would eliminate the need for users to have keys.

If flux-framework/flux-core#2767 is going to solve this eventually anyway, I'd be happy to wait for a PR on that to land and by proxy handle this issue as well.

> In the meantime, do those tests really need to start 4 brokers?

Maybe there is a way around it, but I think 4 ranks are required so that we can run tests with 4 "nodes" worth of hwloc data (from 4 separate hwloc xml files, 1 per rank).

SteVwonder, Jul 29 '20 16:07