First class Jupyter notebook integration
Jupyter notebooks are very popular. They provide a rich, interactive environment for development.
There are numerous kernels that support JavaScript and TypeScript:
- https://github.com/n-riesco/ijavascript
- https://github.com/yunabe/tslab
- https://github.com/winnekes/itypescript
Prompted by a discussion with @apowers313, who's working on a kernel for Deno (https://github.com/apowers313/ideno), I propose we add first-class support for Jupyter in Deno (with a `deno jupyter` subcommand).
I'd argue that providing first-class support for Jupyter will open Deno to the whole community of people using Jupyter notebooks, while additionally providing that community with powerful new tools (after all, Deno supports WebGPU out-of-the-box), and it could significantly help Machine Learning applications of Deno.
This proposal is motivated by several things. Firstly, Deno originated from a similar idea to Jupyter called PropelML that never fully materialized. Secondly, @apowers313 will have to integrate with the V8 inspector protocol to provide kernel functionality; currently Deno doesn't have a programmatic API to interact with the inspector, so it will require quite an effort to integrate over WebSocket. Additionally, most of the functionality that has to be provided for a kernel is already working in the REPL. In fact, most of the REPL functionality could be reused in the kernel; we would have to add communication protocol APIs to integrate with Jupyter.
@kitsonk was eyeballing an implementation of the kernel in Q2/Q3, but that never materialized due to other, more pressing work. I'll be happy to spearhead the effort as it seems like a very fun project to work on.
Roadmap to creating a kernel:
- [x] create a kernel spec
  - see `jupyter kernelspec list` for examples and `jupyter kernelspec install` to install
  - run `jupyter notebook` -- if installed correctly, it will show up under "New" in the top right corner of the Jupyter web browser
  - your kernel will be started as a command line application with the arguments specified in the kernel spec. it won't start until you select "new" in the menu (a sketch of a kernel spec appears after this list)
- [x] create zeromq connections
  - the kernel spec will specify a `{connection_file}` that gets converted to a JSON connection file describing IP / ports to connect to
  - the first messages received will be "kernel_info_request" and "comm_info_request" on the shell zmq Dealer connection (examples of the packets can be found here)
  - you can set Jupyter into debug mode so that `jupyter notebook` prints the packets it sends / receives:
    - create a config file using `jupyter notebook --generate-config`
    - set the following options in the config:
      - `c.Application.log_level = 'DEBUG'`
      - `c.JupyterApp.log_level = 'DEBUG'`
      - `c.NotebookApp.log_level = 'DEBUG'`
      - `c.Session.debug = True`
- [x] next create the IOPub zmq Publisher to send "busy" and "idle" packets
- [x] next handle the "execute_request" message, which sends code from the front-end to the kernel
- when a user selects "restart and run all" in the front end, I think it sends all the Jupyter cells at once, you will have to queue execution requests and run them one at a time
- each execution request sends multiple replies, and each reply is expected to embed the original packet
header
in the reply as aparentHeader
, so you'll have to keep state for which task is currently running - you will need to capture
stdout
andstderr
from the kernel and send them back to the front end as stream messages
- [ ] implement "kernel_shutdown" and "kernel_restart" on the Control zmq Dealer connection
- note that the "kernel_shutdown" request has a "restart" option, which presumes that you are starting up a clean JS environment. I have no idea how this is going to work if the kernel and the execution context are running in the same Deno instance.
- I also have no idea how you are going to interrupt Deno mid-execution.
- [x] implement display data to send PNG, SVG, HTML, JSON back to the browser to be rendered in the front end
  - I think these should automatically render for objects that have Symbols on them, similar to `toStringTag`, e.g. `Symbol.toPngTag`. Maybe TC39 worthy?
- [ ] implement an interpreter to parse out line magics and cell magics
  - pay special attention to:
    - automagic, which turns off requiring "%" at the front of magics. instead of requiring a magic like "%ls" the user can just type "ls"
    - `!cmd` command execution
    - magic assignment like "output = %ls"
    - `{var}` substitution
    - inline documentation and inspection like "?" and "??". I dream of a Symbol.toDocTag on Objects that contains documentation (maybe populated by JSDoc comments) or URLs to documentation, similar to Python's docstrings. Might be a TC39 proposal?
    - input and output caching in In[n] and Out[n] (also `_`, `__`, and `___`)
  - feel free to steal magics or the interpreter from magicpatch.
  - there should probably be an API to enable users to add their own magics.
- [ ] implement introspection and completion
- [ ] maybe implement code completeness which is only used by command line Jupyter front ends to determine when to execute code
Sorry, I realize that's a lot... hopefully it's helpful.
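For the kernel-spec step above, here is a minimal sketch of what a kernel spec for a hypothetical `deno jupyter` subcommand could look like. The `argv` flags and the output directory are illustrative assumptions, not a shipped spec; Jupyter substitutes `{connection_file}` with the path to the connection file when it starts the kernel.

```ts
// Hypothetical kernel spec for a `deno jupyter` subcommand (flag names assumed).
const kernelSpec = {
  argv: ["deno", "jupyter", "--kernel", "--conn-file", "{connection_file}"],
  display_name: "Deno",
  language: "typescript",
};

// Write kernel.json into a directory that `jupyter kernelspec install <dir>` can pick up.
await Deno.mkdir("deno-kernel", { recursive: true });
await Deno.writeTextFile(
  "deno-kernel/kernel.json",
  JSON.stringify(kernelSpec, null, 2),
);
```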
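And for the `execute_request` / IOPub steps, a rough sketch of the JSON message shape and of how a reply embeds the request's header as `parent_header`. The field names follow the Jupyter messaging spec; `replyTo` and `iopub.send` are placeholder names, and the zmq framing, identities, and HMAC signature are omitted.

```ts
// Shape of a Jupyter wire message (JSON parts only).
interface MessageHeader {
  msg_id: string;
  session: string;
  username: string;
  date: string; // ISO 8601 timestamp
  msg_type: string;
  version: string; // messaging protocol version, e.g. "5.3"
}

interface JupyterMessage {
  header: MessageHeader;
  parent_header: MessageHeader | Record<string, never>;
  metadata: Record<string, unknown>;
  content: Record<string, unknown>;
}

// Every reply embeds the originating request's header as `parent_header`,
// which is how the front end matches output to the cell that produced it.
function replyTo(
  request: JupyterMessage,
  msg_type: string,
  content: Record<string, unknown>,
): JupyterMessage {
  return {
    header: {
      ...request.header,
      msg_id: crypto.randomUUID(),
      msg_type,
      date: new Date().toISOString(),
    },
    parent_header: request.header,
    metadata: {},
    content,
  };
}

// e.g. on IOPub, around executing a queued request:
//   iopub.send(replyTo(req, "status", { execution_state: "busy" }));
//   ...run the code, forward stdout/stderr as "stream" messages...
//   iopub.send(replyTo(req, "status", { execution_state: "idle" }));
```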
A bit of a hurdle for the integration is the fact that the only crate providing async bindings for ZeroMQ is currently marked as unstable and not recommended for production use: https://github.com/zeromq/zmq.rs
This crate builds on top of https://github.com/erickt/rust-zmq, which provides sync bindings (which might not be a big deal), but it seems its build process might be quite involved.
I will do some more research on this topic before proceeding.
@apowers313 thank you for providing the roadmap, this is very helpful!
If you get painted into a corner, Jupyter appears to require a very small subset of ZMQ: it appears to only use NULL security, send ~4 packets to negotiate the session, and then has a control / length header for each data chunk. There's a Wireshark plugin for ZMQ if you want to see how it works. (Note: I had to use an older commit to get the plugin to work.)
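For reference, a rough sketch of the fixed 64-byte ZMTP 3.x greeting with NULL security, based on a reading of the ZMTP spec rather than anything in this thread; the subsequent READY handshake and the per-frame flags / length headers are not shown.

```ts
// ZMTP 3.x greeting: 10-byte signature, version, 20-byte mechanism name,
// as-server flag, and zero filler up to 64 bytes.
function zmtpGreeting(): Uint8Array {
  const g = new Uint8Array(64); // everything not set below stays zero
  g[0] = 0xff; // signature start
  g[9] = 0x7f; // signature end
  g[10] = 3;   // protocol major version
  g[11] = 0;   // protocol minor version
  new TextEncoder().encodeInto("NULL", g.subarray(12, 32)); // security mechanism
  g[32] = 0;   // as-server: 0 for a NULL-mechanism client
  return g;
}

// A client would write this greeting right after connecting, then read the
// peer's greeting back before exchanging READY commands.
```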
@apowers313 perfect!
I'd like some feedback on where / how to implement the user-facing Jupyter API for the Deno Jupyter kernel. This would be the API for users to display charts / images, show object-specific documentation, add new magics, etc.
I think regardless we will want a `Deno.core.jupyter` interface, which will only be instantiated when Deno is running in Jupyter (useful for feature detection), and that interface will have `Deno.core.jupyter.display(mimeType, data)` for rendering and saving formatted data in Jupyter. Similarly, it would have `Deno.core.jupyter.addmagic(name, fn)` for user-implemented functions. This enables users to import modules that will detect Jupyter and implement new functionality (similar to how `%matplotlib` works in Python's Jupyter today).
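To make that concrete, here is a hypothetical usage sketch from a user module's point of view; the `Deno.core.jupyter` namespace and its methods are only the shapes proposed above, not an existing Deno API.

```ts
// Assumed shape of the proposed namespace.
type JupyterApi = {
  display(mimeType: string, data: Uint8Array): void;
  addmagic(name: string, fn: (...args: string[]) => unknown): void;
};

// Feature detection: the namespace would only exist when running under Jupyter.
const jupyter = (Deno as unknown as { core?: { jupyter?: JupyterApi } })
  .core?.jupyter;

if (jupyter) {
  // Register a user-defined magic, e.g. `%hello Deno` in a notebook cell.
  jupyter.addmagic("hello", (name = "world") => `hello, ${name}`);
  // Render pre-formatted data in the notebook output.
  jupyter.display("text/html", new TextEncoder().encode("<b>%hello registered</b>"));
}
```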
Requested Feedback 1: I'd be interested if anyone objects to `Deno.core.jupyter` as a direction.
The part where design decisions are needed is an interface / protocol for Objects to automatically convert them to structured data types. For example, if a user returns an object implementing `Foo.toPng`, that function should be called and the returned data should be rendered as a PNG.
Requested Feedback 2: Four options for how to do this:

- `Foo.toPng()` -- Seems antiquated and potentially has namespace conflicts since it isn't Symbol based
- `Foo[Symbol(Deno.toPng)]` -- Deno-wide specific decoding of Objects, similar to Deno.customInspect. This allows the entire Deno ecosystem to benefit from this feature, not just Jupyter, and eventually enables whatever comes after Jupyter or other new innovations.
- `Foo[Symbol(Deno.core.jupyter.toPng)]` -- Jupyter-only symbols; not nearly as useful, but keeps them out of the rest of Deno if people don't think this functionality is going to be broadly useful.
- `Foo[Symbol(Symbol.toPng)]` -- Requires modifying Symbol, similar to `toStringTag`, but potentially benefits all of JS. Might require a TC39 proposal to ensure that Deno doesn't drift from ECMAScript specs.
Thanks!
I just checked in a proposed API for Jupyter display:

- `display(mimeType, uint8Buf, opts)`
- `displayPngFile(path, opts)`
- `displayPng(buf, opts)`
- `displayFile(path)` -- guesses file type based on file extension

I'm trying to decide if it would be more convenient to overload `displayPng` with all the different types it could support (buf, file path, stream, whatever tomorrow's thing is...) or if it's better to have different function calls for each input type. Any thoughts would be appreciated.
> I think regardless we will want a `Deno.core.jupyter` interface, which will only be instantiated when Deno is running in Jupyter (useful for feature detection), and that interface will have `Deno.core.jupyter.display(mimeType, data)` for rendering and saving formatted data in Jupyter. Similarly, it would have `Deno.core.jupyter.addmagic(name, fn)` for user-implemented functions. This enables users to import modules that will detect Jupyter and implement new functionality (similar to how `%matplotlib` works in Python's Jupyter today).
>
> Requested Feedback 1: I'd be interested if anyone objects to `Deno.core.jupyter` as a direction.
>
> The part where design decisions are needed is an interface / protocol for Objects to automatically convert them to structured data types. For example, if a user returns an object implementing `Foo.toPng`, that function should be called and the returned data should be rendered as a PNG.
Sounds good to me, but it should be the `Deno.jupyter` namespace instead of `Deno.core.jupyter`.
> - `Foo.toPng()` -- Seems antiquated and potentially has namespace conflicts since it isn't Symbol based
> - `Foo[Symbol(Deno.toPng)]` -- Deno-wide specific decoding of Objects, similar to Deno.customInspect. This allows the entire Deno ecosystem to benefit from this feature, not just Jupyter and eventually enables whatever comes after Jupyter or other new innovations.
> - `Foo[Symbol(Deno.core.jupyter.toPng)]` -- Jupyter only symbols, not nearly as useful but keeps them out of the rest of Deno if people don't think this functionality is going to be broadly useful.
> - `Foo[Symbol(Symbol.toPng)]` -- Requires modifying Symbol, similar to `toStringTag`, but potentially benefits all of JS. Might require a TC39 proposal to ensure that Deno doesn't drift from ECMAScript specs.
In this case I think we should use something like `Symbol.for("Deno.jupyter")`, similar to `Symbol.for("Deno.customInspect")`.
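A minimal sketch of what that registered-symbol approach could look like from user code, assuming the key `"Deno.jupyter"` and a MIME-bundle return value (neither was finalized in this thread):

```ts
// Objects opt in by implementing a method keyed by a registered symbol,
// analogous to Symbol.for("Deno.customInspect").
class Chart {
  constructor(private values: number[]) {}

  // Hypothetical protocol: return a MIME bundle; the kernel would pick the
  // richest type the front end can display.
  [Symbol.for("Deno.jupyter")]() {
    return {
      "text/plain": `Chart(${this.values.join(", ")})`,
      "image/svg+xml":
        '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="20"></svg>',
    };
  }
}

// In a notebook, returning `new Chart([1, 2, 3])` from a cell would render the SVG.
new Chart([1, 2, 3]);
```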
> I just checked in a proposed API for Jupyter display:
>
> - `display(mimeType, uint8Buf, opts)`
> - `displayPngFile(path, opts)`
> - `displayPng(buf, opts)`
> - `displayFile(path)` -- guesses file type based on file extension
>
> I'm trying to decide if it would be more convenient to overload `displayPng` with all the different types it could support (buf, file path, stream, whatever tomorrow's thing is...) or if it's better to have different function calls for each input type. Any thoughts would be appreciated.
I believe the "overload" approach would be better in this case - we already use this approach in numerous Deno
APIs.
Deno.jupyter.display(mimeType: string, buf: Uint8Array, opts);
Deno.jupyter.displayPng(pathOrBuf: string | Uint8Array, opts);
Deno.jupyter.displayFile(path: string);
Seem preferable, what are the opts
that could be used for displaying files?
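A short sketch of what the overloaded variant could look like, assuming the low-level `display(mimeType, buf, opts)` primitive from the proposal (stubbed out here):

```ts
interface DisplayOpts {
  width?: number;
  height?: number;
}

// Stand-in for the proposed low-level primitive.
function display(mimeType: string, buf: Uint8Array, opts: DisplayOpts = {}): void {
  console.log(`display ${mimeType}: ${buf.byteLength} bytes`, opts);
}

// One entry point that accepts either a file path or a raw buffer.
async function displayPng(
  pathOrBuf: string | Uint8Array,
  opts: DisplayOpts = {},
): Promise<void> {
  const buf = typeof pathOrBuf === "string"
    ? await Deno.readFile(pathOrBuf)
    : pathOrBuf;
  display("image/png", buf, opts);
}

// await displayPng("chart.png", { width: 640 });
// await displayPng(new Uint8Array([/* ...png bytes... */]));
```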
Is this, and IDeno, still being worked on?
Nope, I stopped working on IDeno in favor of the built-in Jupyter kernel. The built-in kernel stalled out because the ZMQ library we were using had some bugs.
IDeno was mostly functional, happy to pass the baton if anyone wants to pick it up.
Hey @tif-calin, @apowers313! We got the kernel mostly working, but that ZMQ library bug was quite serious and happened very often (it manifested every 3-4 connections). If there was a different library that we could use, we should be able to revive that PR without much trouble; it still seems like a great feature for many people.
I would be very interested in seeing this happening. Any way I can help?
> I would be very interested in seeing this happening. Any way I can help?
Fix the Rust ZMQ library? :)
Do you have a specific bug that needs to be fixed? Is it filed somewhere?
> Do you have a specific bug that needs to be fixed? Is it filed somewhere?
https://github.com/zeromq/zmq.rs/issues/153
Also, for a mostly working non-Rust version of a kernel: https://github.com/apowers313/ideno
Love this. Where has the work coalesced?
Background: I'm a longtime Jupyter, IPython, and ZeroMQ maintainer. I'd love to help steward this work.
@rgbkrk still stuck on this bug as far as I can tell: https://github.com/zeromq/zmq.rs/issues/153
Hey @rgbkrk, thanks for stopping by. So @apowers313 and I had a PR that was quite close to landing (https://github.com/denoland/deno/pull/13122); unfortunately the bug above caused it to be very flaky (2 out of 3 times you opened a notebook it resulted in a `Broken pipe` error). Besides that, the PR was more or less ready to land.
We recently discussed this feature with @crowlKats and @dsherret and we'd like to resurrect the PR. We were thinking of maybe rewriting the parts of `zmq.rs` that are necessary for the Jupyter kernel purely in Rust, with Tokio integration in mind. If you have other ideas I'd be more than happy to hear them!
Looks like I'm going to have to learn Rust. While it might not be the best approach, you might get more reliability more quickly by building on top of libzmq, even though it would pale in comparison to native Rust bindings. As far as I can tell, there's a lot to be tested within zmq.rs.
I'm curious if this jupyter rust kernel ran into the same issues. Have you all checked that out too?
> I'm curious if this jupyter rust kernel ran into the same issues. Have you all checked that out too?
I did not know about that project. I can certainly check it.
Let me get my PR rebased and reopened so we can discuss over code.
Opened #20337, which is rebased against `main`.
FYI, it looks like the PR above works quite nicely with notebook integration in VSCode, but all hell breaks loose when I try it with `jupyter notebook`. I think the PR is quite close to being landable; it probably needs 5-10 hours of work to polish and release.