jupyter-server-proxy

Option to allow starting-up again

Open dpwrussell opened this issue 3 years ago • 7 comments

Proposed change

Allow a way to stop, and later start the proxied service again.

At the moment, if the managed process exits with a non-0 exit status, then it will immediately start back up. If the managed process exits with a 0 exit status then the service is not restarted, and if the Jupyter menu item is clicked again, a 500 server error is shown instead of starting the process.

In order to free up resources it would be good to be able to stop the proxied service, then at some later time, decide to start it up again by clicking on the menu item in Jupyter.

Alternative options

Who would use this feature?

If the proxied service is expensive (in terms of resources), then it makes sense to be able to stop it (freeing resources), then later start it again.

(Optional): Suggest a solution

I can think of two reasonable ways to implement this, plus a hybrid of them.

  1. A configuration option which causes a 0 exit code to exit cleanly as today, but start a new supervised process if triggered from Jupyter.
  2. An exit status which causes this "restartable" mode, where a new supervised process can be triggered from Jupyter. I'm less keen on this because I don't think there is a status code with those semantics and it might break existing behaviour.
  3. Some hybrid of the two, possibly implemented as some kind of customizable on_exit hook that one could supply.
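To make option 1 concrete, here is a minimal sketch of the intended semantics using only Python's standard `asyncio`. It is not the jupyter-server-proxy or simpervisor code; the class `RestartableProcess`, its `restart_after_clean_exit` flag, and the method names are hypothetical and exist only for illustration. A non-zero exit restarts the command, a zero exit stops supervision, and a later request may start it again if the flag allows.

```python
import asyncio
import sys


class RestartableProcess:
    """Hypothetical supervisor; not the jupyter-server-proxy or simpervisor API."""

    def __init__(self, command, restart_after_clean_exit=True):
        self.command = command
        # Hypothetical flag standing in for the proposed configuration option.
        self.restart_after_clean_exit = restart_after_clean_exit
        self.starts = 0              # number of times the command was launched
        self.running = False
        self.exited_cleanly = False
        self._watcher = None

    async def _supervise(self):
        while True:
            proc = await asyncio.create_subprocess_exec(*self.command)
            self.starts += 1
            if await proc.wait() == 0:
                # Clean exit: stop supervising and record that we stopped.
                self.running = False
                self.exited_cleanly = True
                return
            # Non-zero exit: loop around and launch the command again.

    async def ensure_started(self):
        """Call when the user opens the proxied service (e.g. the menu item)."""
        if self.running:
            return
        if self.exited_cleanly and not self.restart_after_clean_exit:
            raise RuntimeError("process exited cleanly and restarts are disabled")
        self.running = True
        self.exited_cleanly = False
        self._watcher = asyncio.create_task(self._supervise())

    async def wait_stopped(self):
        """Wait until the supervised command has exited cleanly."""
        if self._watcher is not None:
            await self._watcher
```

With the flag set to `False`, the second call to `ensure_started()` raises, which roughly corresponds to the 500 error described above; with it set to `True`, the same call launches a fresh supervised process.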

dpwrussell, Sep 08 '20 08:09

(1) sounds like a good idea. Processes are managed by the simpervisor library, but I think it can be implemented without touching that.

Alternatively, does it need to be configurable? Is there a situation where you want to prevent someone from manually restarting a stopped process? Does anyone else have thoughts?

manics, Sep 08 '20 09:09

Hi @manics

I can't really imagine why you would want to ensure that it only runs once, but I guess you never know what people have used this for.

My suspicion (from a brief glance at the code) is that by removing a lock variable on graceful exit it might just work. I'm just trying to confirm that now.

dpwrussell, Sep 08 '20 10:09

Also, it looks like this has been discussed before (https://discourse.jupyter.org/t/lifecycle-for-servers-started-by-jupyter-server-proxy/314) and there is a FIXME to implement exactly this.

dpwrussell, Sep 08 '20 11:09

I've had some difficulty setting up a development environment for jupyter-server-proxy so that I can experiment efficiently with this. If anyone could help with that, it would be appreciated. I posted here: https://discourse.jupyter.org/t/jupyter-server-proxy-development-environment/5906

dpwrussell, Sep 08 '20 15:09

An option like auto_restart=(bool) would be very nice for my use case.

We are using the proxy to launch a service on an HPC system that has limited GPUs. Without a fix, users would easily use up all ~30 GPUs.

Our workaround is to start a webservice before the real service. This prevents the loop.

#!/bin/bash
# start a simple webserver on the proxy port
python <<EOF
import SimpleHTTPServer
# start on the proxy port and show a webpage that has a button.
# webserver waits until user clicks html button then sys.exit(0)
EOF

# short sleep to ensure socket is free.
sleep 0.1s

# start the real service.
out=$(start_service_gpu.sh $WWW_PORT 2>&1)

# (optional) we start another webserver to show STDOUT
timeout 5s python <<EOF
import SimpleHTTPServer
EOF

Yes, it's a hack, but it works well and provides some extra functionality. However, I now have these bash scripts that never end, which is not good for my metric tracking.
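For anyone wanting to try the same workaround, the first placeholder stage of the script above could look roughly like the Python 3 sketch below. It is an illustration of the idea, not the script actually in use: the names `GateHandler`, `make_gate`, and `wait_for_click` are made up here, and it uses `http.server` from the standard library in place of the Python 2 `SimpleHTTPServer` module. The server shows a page with a button and stops serving once the button is pressed, which frees the port for the real service.

```python
import http.server
import threading

# Page with a single button; the form posts back to whatever URL served it,
# so it also works when the service is reached through a proxy prefix.
PAGE = b"""<html><body>
<form method="post"><button>Start service</button></form>
</body></html>"""


class GateHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: GET shows the button, POST stops the server."""

    def do_GET(self):
        self._reply(PAGE)

    def do_POST(self):
        self._reply(b"<html><body>Starting, reload in a moment.</body></html>")
        # shutdown() blocks until serve_forever() returns, so it must be
        # called from a thread other than the one serving this request.
        threading.Thread(target=self.server.shutdown, daemon=True).start()

    def _reply(self, body):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the wrapper script's output quiet


def make_gate(port, host="127.0.0.1"):
    """Bind the placeholder server to the port the proxy will connect to."""
    return http.server.HTTPServer((host, port), GateHandler)


def wait_for_click(server):
    """Serve until the button is pressed, then close the socket."""
    with server:
        server.serve_forever()
```

In the wrapper script, the first heredoc would call `wait_for_click(make_gate(port))` with the proxy port and then exit with status 0, after which the real service can bind the same port.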

I did look at the code expecting an easy win, but I got a little lost in the async stuff.

steverweber, Oct 10 '20 20:10

Oh, silly me. After reviewing the code and somewhat getting it figured out, I realized that "Add configuration option to allow restart on graceful exit #215" is a PR. Well done @dpwrussell ... now to get it merged.

steverweber, Oct 10 '20 20:10