
Add 'cancel spawn' functionality

Open dr-br opened this issue 3 years ago • 6 comments

Proposed change

I would like to give our HPC cluster users the ability to cancel the spawning process. If they have selected the wrong resources, they can get stuck in the spawning state with no way to cancel it and spawn another job with different resources.

Alternative options

The only way to cancel a pending job is to SSH into the cluster and run scancel <jobid> (SLURM).

Who would use this feature?

All our HPC users.

(Optional): Suggest a solution

A button with "cancel spawn" functionality. It simply runs scancel <jobid> (SLURM) or the comparable command for Torque etc.
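As a rough illustration of what such a button could trigger on the server side, here is a minimal Python sketch. It is not batchspawner's implementation: the `CANCEL_COMMANDS` mapping, `build_cancel_command`, and `cancel_job` are hypothetical names invented for this example. The only facts assumed are that `scancel <jobid>` cancels a Slurm job and `qdel <jobid>` cancels a Torque job.

```python
import subprocess

# Hypothetical mapping from scheduler name to its cancel executable.
# scancel (Slurm) and qdel (Torque) are the schedulers' own CLI tools.
CANCEL_COMMANDS = {"slurm": "scancel", "torque": "qdel"}


def build_cancel_command(scheduler, job_id):
    """Return the argv list that cancels job_id on the given scheduler."""
    try:
        executable = CANCEL_COMMANDS[scheduler.lower()]
    except KeyError:
        raise ValueError(f"unsupported scheduler: {scheduler!r}") from None

    job_id = str(job_id)
    # Accept only characters that commonly appear in job IDs
    # (e.g. "12345", "12345_7", "12345.headnode") to avoid passing junk.
    if not job_id or not all(ch.isalnum() or ch in "._[]-" for ch in job_id):
        raise ValueError(f"suspicious job id: {job_id!r}")
    return [executable, job_id]


def cancel_job(scheduler, job_id, run=subprocess.run):
    """Run the cancel command; return True if it exited with status 0.

    The run argument is injectable so the function can be exercised
    without a real scheduler installed.
    """
    command = build_cancel_command(scheduler, job_id)
    result = run(command, capture_output=True, text=True)
    return result.returncode == 0
```

Passing the command as a list rather than a shell string avoids shell interpretation of the job ID. The command would also have to run as the user who owns the job (or with scheduler admin rights), which this sketch does not address.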

dr-br avatar Oct 23 '20 08:10 dr-br

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

welcome[bot] avatar Oct 23 '20 08:10 welcome[bot]

This is a nice idea, and now that you mention it, it would be useful.

The problem is that the spawn process and the page presented to the user are controlled by the hub, so JupyterHub itself would have to be adjusted to offer these options before batchspawner could use them.

Other relevant issues I can find:

  • https://github.com/jupyterhub/jupyterhub/issues/2975 - one can't stop a pending server. This would presumably need to be solved first.

So, I propose we transfer this to the JupyterHub repository. Any other comments about this? (Perhaps we can discuss at our monthly meeting)

rkdarst avatar Oct 23 '20 13:10 rkdarst

So, I propose we transfer this to the JupyterHub repository. Any other comments about this? (Perhaps we can discuss at our monthly meeting)

If you could initiate that: Could you please take over? Thanks!

dr-br avatar Oct 26 '20 07:10 dr-br

Did this feature request get created on the JupyterHub repo? I can't seem to find it.

hakasapl avatar Mar 28 '22 14:03 hakasapl

So what are the alternatives at the moment?

  • letting the jobs run forever (sigh), potentially filling the cluster if you have more users than nodes
  • setting a wallclock limit and letting Slurm kill the job, which is a poor experience for users whose jobs are killed while they are working
  • enabling culling? Any downsides to the latter?

davidedelvento avatar Jul 29 '22 13:07 davidedelvento

This is the relevant JupyterHub issue; deleting a server is effectively the same as cancelling a spawn: https://github.com/jupyterhub/jupyterhub/issues/2975
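For reference, deleting a user's server goes through the JupyterHub REST API as `DELETE /users/{name}/server`. The sketch below only builds that request with the standard library and does not send it. The function name `build_stop_server_request`, the hub URL, and the token value are placeholders for this example. As the linked issue describes, the hub may not honour this request while the spawn is still pending, so treat it as an illustration of the call rather than a working cancel.

```python
from urllib.parse import quote
from urllib.request import Request


def build_stop_server_request(hub_api_url, username, token):
    """Build (but do not send) a request that stops a user's default server.

    hub_api_url is the base of the hub's REST API, typically ending in
    /hub/api. The token must belong to the user or to an admin.
    """
    url = f"{hub_api_url.rstrip('/')}/users/{quote(username, safe='')}/server"
    return Request(
        url,
        method="DELETE",
        headers={"Authorization": f"token {token}"},
    )


# Example with placeholder values; send with urllib.request.urlopen(request).
request = build_stop_server_request(
    "https://hub.example.org/hub/api", "alice", "REPLACE_WITH_API_TOKEN"
)
```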

manics avatar Dec 10 '22 23:12 manics