support forwarding of X11 display from allocation
@ryanday36 reminded us about this use case in this week's project meeting.
Use case is e.g. `flux mini alloc -n1 xterm`.
For reference, Slurm users have options described in this slurm FAQ entry.
Duplicate of #2801.
Note: forwarding currently works in TOSS 3 for the reasons described in #2801: ssh forwarding on our clusters allows connections over the cluster-local network, and since `DISPLAY` is copied from the submission environment, it works almost by accident (as long as you don't log out of the ssh session providing the tunnel on the login node).
In TOSS 4 (on fluke at least) sshd no longer binds to a routable network port, and `DISPLAY` is set to `localhost:<display>.0` (sshd may bind only to the localhost address, or it may be using a unix domain socket; I haven't looked into it in detail yet). Therefore, I don't think the solution above will work going forward. We may have to look into how to set up port forwarding or X11 tunneling back to login nodes for Flux jobs.
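If it helps anyone else checking this, both behaviors can be confirmed from a login-node ssh session with something like the following (exact output will differ by site):

```bash
# What DISPLAY did sshd set for this session?
echo $DISPLAY           # e.g. localhost:10.0 on TOSS 4

# Is the X11 forwarding listener bound to a routable address or only loopback?
ss -lnt | grep ':60'    # 127.0.0.1:6010 vs. 0.0.0.0:6010 (or no TCP listener
                        # at all if a unix domain socket is being used)
```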
I'm actually unsure this is something that needs support directly in Flux, at least at this early stage; it may be more of a general site configuration issue.
There are two possible solutions:
1. Configure sshd on login nodes to bind the X11 listener to a cluster-local address so `DISPLAY` exported to jobs "just works" (same configuration as TOSS 3; see the sketch after this list).
2. If 1 is not an option, since passwordless ssh now works intra-cluster, we could have a job prolog script that sets up ssh local port forwarding to connect back to the login node's localhost X11 proxy, then adds the appropriate cookie to the Xauthority file with `xauth`.
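For option 1, the relevant sshd knob is `X11UseLocalhost`. A minimal sketch of the login-node configuration (whether binding the forwarded display to a non-loopback address is acceptable is a site security decision, and traffic from compute nodes to ports 6000+N on login nodes would also need to be allowed):

```
# /etc/ssh/sshd_config on login nodes (sketch)
X11Forwarding yes
# Bind the X11 forwarding listener to the wildcard address instead of loopback,
# so sshd sets DISPLAY=<loginhost>:<display>.0 and compute nodes can reach it
# over the cluster-local network.
X11UseLocalhost no
```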
Below is a working proof of concept that sets up the correct proxy and xauth. It can be run by hand for now, but the idea is something like this could be run from a job prolog with the job's user credentials (e.g. under runuser; a usage sketch follows the script).
There are a lot of assumptions in this implementation (all of which are currently true on fluke):
- `HOSTNAME` in the environment array submitted by the job is the hostname of the node from which the job was submitted. However, `HOSTNAME` is not in the list of frequently set or other environment variables listed in POSIX.1-2017 (`HOSTNAME` may be a bashism). We probably need a solution to #2875 to solve this problem in general. (See the snippet after this list for a quick way to check what a job actually submitted.)
- The node from which the job is submitted has ssh X11 forwarding active from the user's X server.
- `DISPLAY` is set in the submission environment.
- `DISPLAY` is set to `localhost` or nothing (equivalent to localhost, I think?). In the case where `DISPLAY` is set to a hostname or routable address, this prolog would likely not be required.
- Password-less ssh works from compute nodes to login nodes.
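To see what a given job actually carried for these variables, the jobspec can be queried by hand, e.g. (this is the same query the `job_getenv()` helper below makes):

```bash
# substitute a real jobid
flux job info <jobid> jobspec \
    | jq '.attributes.system.environment | {HOSTNAME, DISPLAY}'
```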
There are probably some other caveats as well. Individual sites should evaluate this solution in their environment and possibly re-implement a site-local variant.
This script, if used from a job prolog, should probably have a corresponding epilog script that kills off the background ssh process providing the proxy and removes the xauth cookie from the user's Xauthority file.
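A rough sketch of what such an epilog could look like (untested; it assumes the exact ssh command line used in the prolog below and that it runs with the job user's credentials):

```bash
#!/bin/bash
# Sketch of an epilog counterpart: tear down the proxy and xauth cookie
# created by the prolog below. Untested.

DISPLAY=$(flux job info $FLUX_JOB_ID jobspec \
            | jq -r '.attributes.system.environment.DISPLAY // empty')
test -z "$DISPLAY" && exit 0

# Same port computation as the prolog: X11 display N listens on TCP 6000+N
display=${DISPLAY#*:}
port=$((${display%.*}+6000))

# Kill the backgrounded ssh providing the local port forward
pkill -u "$(id -un)" -f "ssh -4 -fN -L ${port}:localhost:${port}"

# Remove the cookie the prolog added for this display
xauth remove "$DISPLAY"
```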
```bash
#!/bin/bash
#
# Proof of concept: forward the X11 port from this node back to the submitting
# host's sshd X11 proxy and install the matching xauth cookie, so the DISPLAY
# from the submission environment works here.

job_getenv()
{
    # Fetch the job's submitted environment from the jobspec (once), then
    # return the value of the variable named in $1, or nothing if unset.
    if test -z "$FLUX_JOB_ENV"; then
        FLUX_JOB_ENV=$(flux job info $FLUX_JOB_ID jobspec \
            | jq .attributes.system.environment)
    fi
    echo "$FLUX_JOB_ENV" | jq -r ".$1 // empty"
}

host=$(job_getenv HOSTNAME)
DISPLAY=$(job_getenv DISPLAY)

if test -z "$host" -o -z "$DISPLAY"; then
    echo >&2 "HOSTNAME or DISPLAY not set in environment of job. Aborting.."
    exit 0
fi

# Only handle DISPLAY of the form localhost:<display>.<screen> or :<display>.<screen>
displayhost=${DISPLAY%:*}
if ! test "$displayhost" = "localhost" -o -z "$displayhost"; then
    echo >&2 "DISPLAY hostname is not empty or localhost"
    exit 0
fi

# X11 display N listens on TCP port 6000+N
display=${DISPLAY#*:}
port=$((${display%.*}+6000))

# Forward local X11 port to login host
ssh -4 -fN -L ${port}:localhost:${port} ${host}

# Add xauth cookie for this display, copied from the submitting host
xauth add $DISPLAY . $(ssh $host xauth list $DISPLAY | awk '{ print $3 }')

# vi: ts=4 sw=4 expandtab
```
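For reference, one way a system prolog might invoke it with the job owner's credentials (the install path here is hypothetical; the discussion below ends up using `sudo -u` from the prolog run by flux-perilog-run instead):

```bash
# run the proof-of-concept script as the job's user (path is hypothetical)
runuser -u "$(id -un ${FLUX_JOB_USERID})" -- /etc/flux/system/prolog.d/x11-proxy.sh
```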
I don't think that I ever heard back from our ISSOs about option 1. I'll bring that back up with them.
I was looking at running this from the prolog on fluke. I run it as the user with `sudo -u \#${FLUX_JOB_USERID} ...` in the prolog script that is run by perilog-run. It appears to do what I expect, but the perilog-run process doesn't complete unless I kill the background ssh process, so my jobs never really start. Is there a way that I could run this script so that the prolog completes?
Would `nohup sudo -u \#${FLUX_JOB_USERID} ...` get the job done?
That doesn't fix it. To clarify, the background ssh process starts up and the script completes, so the only thing still running is the background ssh process. The problem appears to be that the background ssh process is keeping the `flux-imp run prolog` process on the management node (rank 0), and by extension the `flux-perilog-run` process that spawns it, from completing.
The only other thing I can think of is that the background ssh process is holding the stdout/err file descriptors open. Does running ssh with `>/dev/null 2>&1` help? I had mistakenly thought that `-f` did this for us, but perhaps not. (`nohup` only seems to redirect stdout/err if the current file descriptors point to a terminal, which may be why that doesn't help here.)
Yes! Redirecting stdout/err to /dev/null fixes it. Thanks @grondo.
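For anyone else landing here, the only change needed to the proof of concept above was redirecting the backgrounded ssh's stdout/stderr, i.e.:

```bash
# Forward local X11 port to login host without holding the prolog's
# stdout/stderr open
ssh -4 -fN -L ${port}:localhost:${port} ${host} >/dev/null 2>&1
```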