Ability to load notebook from external URL as URL query

Open oeway opened this issue 2 years ago • 9 comments

Problem

Not sure if this is already supported, but as I understand it, to share notebooks hosted on GitHub with JupyterLite we have to ask the user to first download the notebook file, drag and drop it into JupyterLite, and then open it.

It would be much better if we could pass an external notebook file URL to JupyterLite and get a one-click notebook link, so the user can click it and see the notebook already loaded in the page.

Proposed Solution

Parse the query and load the notebook from the external URL (for example: https://jupyterlite.readthedocs.io/en/latest/_static/lab/index.html?load=https://raw.githubusercontent.com/jupyterlite/jupyterlite/main/examples/python.ipynb), then open it in JupyterLite (maybe as unsaved notebook, so we can avoid overwriting the user's existing notebook and give them a chance to rename it).

oeway avatar Dec 15 '21 11:12 oeway

While this works for some cases, it unfortunately doesn't work for the general case of "things on the internet," as often a proper proxy is needed because of CORS, etc. Related assets (e.g. images referenced in markdown, data files) would not be downloaded either (not that they'd get into the "VM" anyway).

For some of these enhancement requests, we have to consider whether:

  • will every deployment want this feature?
  • is there an existing labextension that provides this feature?
  • would it be better as a new liteextension?
  • would it be better as a new labextension that would work in multiple frontends (real Lab, Retro)?

In this case: I definitely have deployments where we've been trying to reduce the number of rando network requests it makes, so I am somewhat disinclined to make a proxy-like thing part of the standard kit.

Implementing the server bits of jupyterlab-git, jupyterlab-github, or jupyterlab-pull-requests in the browser would be quite excellent... there are several in-browser git implementations, if needed, but e.g. jupyterlab-pull-requests just uses the REST API, for example. If one of those worked, out of the box, that would definitely be something we'd put into the main docs build... but even a jupyterlite/jupyterlite-(git(lab)|pull-requests) that just did the part we needed would be reasonable.

Failing the above, I could see this being an excellent extension, either as a custom storage layer for lite or something that also worked upstream: if it was targeted against something we knew worked pretty well (github), it could expose both a link builder UI (which probably needs making anyhow) as well as implement the command. As a setting, it could offer a different provider, in case someone wanted to deploy it with their GitHub enterprise instance.

as unsaved notebook

There is no such thing, as it happens, without replumbing a lot of stuff: the document needs to exist "on disk" for it to be rendered by the existing machinery. Appending a timestamp to the filename could work, but this would make it hard to use with e.g. RTC's ?room, which would be a most excellent use for this.

One approach might be to load such remote assets as an entirely separate IDrive (this is in fact what jupyterlab-github does), as we're unlikely to do anything with local storage there.

bollwyvl avatar Dec 15 '21 12:12 bollwyvl

Hi @bollwyvl, thanks for the quick response and for these insightful ideas. All the options you mentioned seem to point in the direction of switching the storage provider, which I think is cool, but what I (and perhaps many users) want is something that quickly imports a notebook file into the current storage and opens it.

I think the implementation would be really straightforward, maybe a function that does the following:

  1. parse the url in the query
  2. fetch the content
  3. then call the content api of jupyter to add the file to the storage
  4. then open that file
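A minimal TypeScript sketch of these four steps, not the implementation used in any deployment. The `FetchJson`, `NotebookStore` and `DocumentOpener` types are hypothetical stand-ins for `fetch`, the Jupyter contents API and the document opener, so that the flow can be read (and run) without JupyterLab installed; the `load` parameter name is the one proposed in this issue.

```typescript
// Hypothetical stand-ins for fetch, the contents API and the document opener.
// They are NOT the real JupyterLab interfaces.
type FetchJson = (
  url: string
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface NotebookStore {
  save(
    path: string,
    model: { type: 'notebook'; format: 'json'; content: unknown }
  ): Promise<unknown>;
}

interface DocumentOpener {
  open(path: string): void;
}

// Step 1: read the `load` query parameter from the page URL.
function getLoadUrl(pageUrl: string): string | null {
  return new URL(pageUrl).searchParams.get('load');
}

// Use the last path segment of the remote URL as the local file name.
function fileNameFromUrl(remoteUrl: string): string {
  const segments = new URL(remoteUrl).pathname
    .split('/')
    .filter(s => s.length > 0);
  const last =
    segments.length > 0
      ? decodeURIComponent(segments[segments.length - 1])
      : 'Untitled';
  return /\.ipynb$/.test(last) ? last : `${last}.ipynb`;
}

async function importFromQuery(
  pageUrl: string,
  fetchJson: FetchJson,
  store: NotebookStore,
  opener: DocumentOpener
): Promise<string | null> {
  const remoteUrl = getLoadUrl(pageUrl);
  if (remoteUrl === null) {
    return null; // no `load` parameter, nothing to import
  }
  // Step 2: fetch the notebook JSON (fails if the host does not allow CORS).
  const response = await fetchJson(remoteUrl);
  if (!response.ok) {
    throw new Error(`Could not fetch ${remoteUrl}: HTTP ${response.status}`);
  }
  const content = await response.json();
  // Step 3: add the file to whatever storage the deployment uses.
  const path = fileNameFromUrl(remoteUrl);
  await store.save(path, { type: 'notebook', format: 'json', content });
  // Step 4: open the freshly imported file.
  opener.open(path);
  return path;
}
```

In a browser, `url => fetch(url)` could be passed as `fetchJson`. Note that this sketch saves under the remote file name and would therefore overwrite an existing notebook with the same name, which is the concern raised earlier in this thread.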

This means we don't change the storage layer, but work with whatever storage the deployment has selected. It's really just a quick way to import data that one would otherwise have to import manually.

We can also support importing multiple files (e.g. by constructing a URL with multiple load parameters) so one can pull all the assets of a notebook into the storage, and the constructed URL can be shortened via an external URL shortening service.
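As an illustration of the multi-file idea, here is a small sketch using the standard `URL` and `URLSearchParams` APIs. The function names `buildShareLink` and `parseLoadUrls` are made up for this example, and the repeated `load` parameter is only the convention proposed here.

```typescript
// Build a share link with one `load` parameter per remote file.
// URLSearchParams percent-encodes each value, so a nested URL that
// contains its own `?` or `&` is not split into separate parameters.
function buildShareLink(baseUrl: string, fileUrls: string[]): string {
  const link = new URL(baseUrl);
  for (const fileUrl of fileUrls) {
    link.searchParams.append('load', fileUrl);
  }
  return link.toString();
}

// Read every `load` parameter back, in order, already decoded.
function parseLoadUrls(pageUrl: string): string[] {
  return new URL(pageUrl).searchParams.getAll('load');
}
```

Round-tripping a list of file URLs through `buildShareLink` and `parseLoadUrls` should return the same list, including for file URLs that carry their own query string.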

Regarding CORS issues, for sure this won't work for all cases, but in the context of reproducible data science, common services such as GitLab, GitHub and Zenodo all support CORS. I really think this would already be very helpful. For example, one can include these URLs in a scientific publication, and users can click and run them without the hassle of downloading, importing, etc.

Do these make sense?

oeway avatar Dec 15 '21 13:12 oeway

There is also this PR, which added a new "Open from URL" command and is available in the latest JupyterLab 4.0 prereleases: https://github.com/jupyterlab/jupyterlab/pull/11387. It should also work in JupyterLite. There are still some follow-ups to do, such as https://github.com/jupyterlab/jupyterlab/issues/11531.

A third-party extension could provide a plugin that parses ?load= and passes it to that command to achieve that behavior.
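A rough sketch of what the core of such a plugin might look like. `CommandRunner` is a hypothetical structural stand-in for the application's command registry, so the snippet runs without JupyterLab; the command id `filebrowser:open-url` is the one named later in this thread, and the `url` argument name is an assumption about that command's arguments that should be checked against the JupyterLab source.

```typescript
// Hypothetical stand-in for the command registry (e.g. `app.commands`).
interface CommandRunner {
  execute(id: string, args?: Record<string, unknown>): Promise<unknown>;
}

// Command id mentioned in this thread; the `url` argument is an assumption.
const OPEN_URL_COMMAND = 'filebrowser:open-url';

// Forward every `load` query parameter to the "Open from URL" command.
// `search` is the query string of the page, e.g. `window.location.search`.
async function openFromQuery(
  search: string,
  commands: CommandRunner
): Promise<number> {
  const urls = new URLSearchParams(search).getAll('load');
  for (const url of urls) {
    await commands.execute(OPEN_URL_COMMAND, { url });
  }
  return urls.length; // number of files the command was asked to open
}
```

In a real extension this function would presumably be called from the plugin's `activate` hook once the application has finished restoring its layout.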

jtpio avatar Dec 15 '21 13:12 jtpio

core command and third-party plugin URL hack sounds mighty fine to me. being able to use most labextensions, unmodified will make this pretty trivial. but in the core, everybody-gets-it of jupyterlite, i'm just trying to leaven "wouldn't it be cool if" with "if i was a bad actor, what could i do if"... indeed, we would ideally shed some of the current capability (certainly all the kernels) to get the base install as lean-and-mean as possible.

bollwyvl avatar Dec 15 '21 19:12 bollwyvl

For reference, https://github.com/jupyterlite/jupyterlite/pull/528 updated to the latest JupyterLab 3.3.

This brought in the "Open from URL" feature added in https://github.com/jupyterlab/jupyterlab/pull/11387, which is already functional in the latest 0.1.0b3 release:

https://user-images.githubusercontent.com/591645/158786569-20a84697-ca50-4868-bd66-3b5f4893b893.mp4

So third-party extensions can already make use of this command and combine it with URL query parameters.

jtpio avatar Mar 17 '22 12:03 jtpio

I think the ability to load a simple notebook JSON file (excluding any embedded images, etc.) via a JupyterLite URL path= parameter, or a new url= parameter, would be really handy, e.g. for providing a cut-down, ad hoc MyBinder-like service that installs no requirements, simply from a JupyterLite environment served by a web server.

It does mean that third parties might choose to use your bandwidth to load a JupyterLite environment to run their hosted notebook from a URL, but an optional config switch (like the "collaboration": true config setting) would mean you at least have to opt in to this feature.
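One way such an opt-in switch could be checked before any remote fetch is sketched below. The setting names `enableRemoteLoad` and `allowedHosts` are hypothetical and used only for illustration; they are not existing JupyterLite configuration keys. The optional host list is an extra assumption, added with the "bad actor" concern from earlier in the thread in mind.

```typescript
// Hypothetical settings shape; these keys are illustrative only.
interface RemoteLoadSettings {
  enableRemoteLoad?: boolean; // the opt-in switch, off unless set to true
  allowedHosts?: string[]; // optional list of hosts notebooks may come from
}

function isRemoteLoadAllowed(
  settings: RemoteLoadSettings,
  remoteUrl: string
): boolean {
  // The feature stays off unless the deployment explicitly opts in.
  if (settings.enableRemoteLoad !== true) {
    return false;
  }
  let parsed: URL;
  try {
    parsed = new URL(remoteUrl);
  } catch {
    return false; // not a valid absolute URL
  }
  // Only accept HTTPS sources.
  if (parsed.protocol !== 'https:') {
    return false;
  }
  // Without a host list any HTTPS host is accepted; otherwise it must match.
  const hosts = settings.allowedHosts;
  return hosts === undefined || hosts.indexOf(parsed.hostname) !== -1;
}
```

A deployment that never sets the switch behaves exactly as before, and one that sets it can still restrict loading to, say, `raw.githubusercontent.com`.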

psychemedia avatar Mar 17 '22 15:03 psychemedia

FYI: I have implemented this feature in my own instance, for example, I can now do: https://jupyter.imjoy.io/lab/index.html?load=https://gist.github.com/oeway/391b4352ea57b5682366ce3dc2fa9174&open=1

oeway avatar Jun 03 '22 17:06 oeway

@oeway

FYI: I have implemented this feature in my own instance, for example, I can now do: https://jupyter.imjoy.io/lab/index.html?load=https://gist.github.com/oeway/391b4352ea57b5682366ce3dc2fa9174&open=1

Can you provide a pull request with this implementation?

mlhess avatar Jul 12 '22 01:07 mlhess

How did you implement it? Can you explain please?

dgcmain avatar Aug 10 '22 09:08 dgcmain

I would love to have the "open from url" feature supported. I would be using it together with the rest of the HoloViz community at https://panelite.holoviz.org/lab/index.html to share notebooks in an easy to use and fast to load Python environment.

Panelite is built using the instructions in https://panel.holoviz.org/user_guide/Running_in_Webassembly.html#setting-up-jupyterlite.

MarcSkovMadsen avatar Feb 04 '23 18:02 MarcSkovMadsen

As mentioned above in https://github.com/jupyterlite/jupyterlite/issues/430#issuecomment-1070849420, this could likely be implemented as a third-party extension for JupyterLab.

This would have the advantage of working in both stock JupyterLab and JupyterLite (and potentially other lab-based frontends).

jtpio avatar Feb 06 '23 08:02 jtpio

FYI I made this extension really quickly today: https://github.com/jupyterlab-contrib/jupyterlab-open-url-parameter.

It works as mentioned in the comment above: https://github.com/jupyterlite/jupyterlite/issues/430#issuecomment-1070849420, by using the built-in filebrowser:open-url command from JupyterLab.

Here is a screencast of the extension running in a JupyterLite deployed on ReadTheDocs:

https://user-images.githubusercontent.com/591645/230435406-c4f5a2d5-9d4f-4733-b270-82eb0cb65a71.mp4

The repo is available here: https://github.com/jupyterlab-contrib/jupyterlab-open-url-parameter

You can include the jupyterlab-open-url-parameter extension like a regular JupyterLab extension in your JupyterLite deployment: https://jupyterlite.readthedocs.io/en/latest/howto/configure/simple_extensions.html

Happy to add contributors to the repo if someone would like to help maintain the extension. Feel free to open issues and PRs if you would like to improve it. Thanks!

jtpio avatar Apr 06 '23 16:04 jtpio

https://github.com/jupyterlite/jupyterlite/pull/1044 documents this.

jtpio avatar Apr 06 '23 17:04 jtpio