fargatespawner
Persisting User Input
Hey!
Will this spawner persist the user input inside the notebook? Or is there a way to configure this?
Thanks!
Hi @Nop0x
Depends on what you mean exactly...
The values of the variables in the memory of notebooks are not persisted between the servers stopping and restarting: this I believe is how all notebook servers behave, and AFAIK there is no way for this to be different.
The notebooks themselves are not persisted between stop and start if stored locally on the filesystem, and in a way this is unavoidable: Fargate does not support persistent volumes. However, you can use a Jupyter contents manager to store them remotely. We're using https://github.com/uktrade/jupyters3 to store the notebooks remotely on AWS S3.
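For reference, a minimal `jupyter_notebook_config.py` sketch for pointing the notebook at JupyterS3 might look like the following. The bucket name, region, and prefix are placeholders, and the trait names should be checked against the uktrade/jupyters3 README for the version you install:

```python
# jupyter_notebook_config.py -- a sketch, assuming the JupyterS3 contents
# manager; bucket/region/prefix values below are placeholders.
c = get_config()  # noqa: injected by Jupyter when it loads this file

# Swap the default filesystem contents manager for the S3-backed one,
# so notebooks are stored in S3 rather than on the task's local disk
c.NotebookApp.contents_manager_class = 'jupyters3.JupyterS3'

c.JupyterS3.aws_region = 'eu-west-1'        # placeholder
c.JupyterS3.aws_s3_bucket = 'my-notebooks'  # placeholder
c.JupyterS3.prefix = 'user-1/'              # placeholder per-user key prefix
```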
Michal
Hi @michalc,
thanks for your input. We had really hoped to use this; however, it seems our new JupyterHub Kubernetes deployment does not support changing the spawner.
Do you happen to know how we can use the fargate spawner with a deployment like that?
Cheers Marvin
If the spawner can’t be changed, then at the moment I don’t see how you can use this spawner.
I am curious: why do you want to use this spawner on a Kubernetes cluster? (There may be some background/context I’m missing)
Well, we have no real experience using kube. We were following the ZTJH documentation. However, it seems to be a real hassle to set up auto-scaling and custom authentication for this.
That's why I'm currently researching other deployment methods. The context is that we want to set up a multi-user JupyterHub that seamlessly scales servers up and down to serve the users. Ideally we wanted to use your Fargate spawner for the notebooks and the S3 storage like you mentioned earlier in this issue.
What type of deployment are you guys using if I may ask?
Marvin
At the moment, we've actually moved away from this spawner/JupyterHub, and we're using a different solution to launch JupyterLab in Fargate: https://github.com/uktrade/data-workspace. It allows launching any dockerized web application, not just JupyterLab. However, it's fairly well customised to various other requirements we have: it's not really meant to be a JupyterHub-alternative.
We're also moving away from https://github.com/uktrade/jupyters3, and investigating an asynchronous sync to S3, using (a very much in-progress) https://github.com/uktrade/mobius3
@michalc Hi, could you please elaborate a bit on how you configure that? That's configured for the notebook and not the hub, right? I managed to get it working with the local spawner by modifying the jupyter_notebook_config.py in the user's ~/.jupyter folder, but I'm not sure how to translate that over to the FargateSpawner.
Admittedly it's almost a year after the question but...
@theblazehen It's fairly independent of the config of the spawner itself: it would be the same whether you use JupyterS3 or not. The JupyterS3 configuration is just on the notebook side... If you have configured the notebook with a jupyter_notebook_config.py locally, then you would need a Fargate task that starts a docker image containing this config file. The current example in the README at https://github.com/uktrade/fargatespawner shows how you can configure the spawner to pass the location of this file to the notebook.
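As a rough sketch of what that looks like on the hub side: the spawner is set in `jupyterhub_config.py`, and the notebook is pointed at its config file via the command the Fargate task's container runs. All trait names and values below are paraphrased placeholders, so verify them against the current fargatespawner README:

```python
# jupyterhub_config.py -- sketch only; ARNs, names, and trait spellings are
# placeholders and should be checked against the fargatespawner README.
from fargatespawner import FargateSpawner

c.JupyterHub.spawner_class = FargateSpawner

c.FargateSpawner.aws_region = 'eu-west-1'                # placeholder
c.FargateSpawner.task_cluster_name = 'jupyter'           # placeholder
c.FargateSpawner.task_definition_arn = 'arn:aws:ecs:...' # placeholder
c.FargateSpawner.notebook_port = 8888

# The container command in the Fargate task is where the notebook is told
# about the config file baked into the docker image, e.g. something like:
#   jupyter-notebook --config=/etc/jupyter/jupyter_notebook_config.py
```

The key point is that the JupyterS3 configuration lives in the docker image the task runs, not in the hub's config.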
Maybe mounting an EFS volume for persistent storage might help?
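For anyone finding this thread later: Fargate platform version 1.4.0 added support for mounting EFS file systems, so an EFS volume in the task definition is one way to get storage that survives task restarts. A hypothetical boto3 sketch, with all IDs, names, and the image as placeholders:

```python
# Sketch: registering an ECS task definition that mounts an EFS volume so
# notebook files survive Fargate task restarts. All IDs/names are placeholders.
import boto3

ecs = boto3.client('ecs', region_name='eu-west-1')  # placeholder region

ecs.register_task_definition(
    family='jupyter-notebook',
    requiresCompatibilities=['FARGATE'],
    networkMode='awsvpc',
    cpu='1024',
    memory='2048',
    # Declare the EFS file system as a task-level volume
    volumes=[{
        'name': 'notebooks',
        'efsVolumeConfiguration': {
            'fileSystemId': 'fs-01234567',  # placeholder EFS file system ID
        },
    }],
    containerDefinitions=[{
        'name': 'notebook',
        'image': 'jupyter/base-notebook',  # placeholder image
        # Mount the volume where the notebook server stores files
        'mountPoints': [{
            'sourceVolume': 'notebooks',
            'containerPath': '/home/jovyan/work',
        }],
    }],
)
```

Note that tasks using EFS must be launched with `platformVersion='1.4.0'` (or later), and the EFS mount targets must be reachable from the task's subnets and security groups.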