ipykernel
Kernel subshells (JEP91) implementation
This is the implementation of the kernel subshells JEP (jupyter/enhancement-proposals#91). It follows the latest commit (1f1ad3d), with the addition of a `%subshell` magic command that is useful for debugging. To try it out, I have a JupyterLab branch that talks to this branch; the easiest way to use it is via https://mybinder.org/v2/gh/ianthomas23/jupyterlab/jep91_demo?urlpath=lab. Once the mybinder instance has started, open `subshell_demo_notebook.ipynb` and follow the instructions therein.
The idea is that this is mergeable as it is now. It is backward compatible in that it does not break any existing use of ipykernel (subject to CI confirmation). There are some ramifications of the protocol additions (outlined below) that will need addressing eventually, but I consider these future work that can be done in separate PRs.
Outline of changes
- The parent subshell (i.e. the main shell) runs in the main thread.
- Each new subshell runs in a separate thread.
- There is a new thread that deals with all communication on the shell channel. Previously this was performed in the main thread.
- Communication between the shell channel thread and other threads is performed using ZMQ inproc pair sockets, which are essentially shared memory and avoid the use of thread synchronisation primitives.
- Incoming shell messages are handled by the shell channel thread, which extracts the `subshell_id` from the message and passes it on to the correct subshell.
- Subshells are created and deleted via messages sent on the control channel. These are passed to the shell channel thread via inproc pair sockets so that the `SubshellManager` in the shell channel thread is responsible for subshell lifetimes.
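
The routing described in the list above can be sketched roughly as follows. This is not the code in this PR: `SubshellRouter`, `handle` and `run_parent_once` are hypothetical names, and `queue.Queue` is used as a stdlib stand-in for the ZMQ inproc pair sockets (so, unlike the real implementation, it does rely on thread synchronisation primitives). It only illustrates the shape of the design: one inbox per subshell, one thread per child subshell, and dispatch on the `subshell_id` found in the message header.

```python
import queue
import threading
import uuid


def handle(subshell_id, msg):
    """Placeholder for real execution: just echo the code that was sent."""
    return {"subshell_id": subshell_id, "handled": msg["content"]["code"]}


class SubshellRouter:
    """Hypothetical stand-in for the routing done by the shell channel thread."""

    def __init__(self):
        # subshell_id -> inbox; None denotes the parent subshell (main shell).
        self._inboxes = {None: queue.Queue()}
        self._threads = {}
        # Replies from child subshells, to be sent back to the client.
        self.replies = queue.Queue()

    def create_subshell(self):
        """Start a new subshell thread and return its ID."""
        subshell_id = str(uuid.uuid4())
        inbox = queue.Queue()
        thread = threading.Thread(
            target=self._run, args=(subshell_id, inbox), daemon=True
        )
        self._inboxes[subshell_id] = inbox
        self._threads[subshell_id] = thread
        thread.start()
        return subshell_id

    def delete_subshell(self, subshell_id):
        """Ask a child subshell thread to stop and wait for it to finish."""
        self._inboxes.pop(subshell_id).put(None)  # None is the stop sentinel
        self._threads.pop(subshell_id).join()

    def route(self, msg):
        """Pass a shell message to the subshell named in its header."""
        subshell_id = msg["header"].get("subshell_id")  # missing -> parent
        self._inboxes[subshell_id].put(msg)

    def run_parent_once(self):
        """Handle one parent-subshell message in the calling (main) thread."""
        msg = self._inboxes[None].get(timeout=5)
        return handle(None, msg)

    def _run(self, subshell_id, inbox):
        # Body of a child subshell thread: handle messages until the sentinel.
        while True:
            msg = inbox.get()
            if msg is None:
                return
            self.replies.put(handle(subshell_id, msg))
```

In this sketch a message whose header has no `subshell_id` goes to the parent subshell, which is handled by whichever thread calls `run_parent_once()`, while messages carrying a `subshell_id` are handled on that subshell's own thread and their replies appear on `router.replies`.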
Example scenario
Here is an example of the communication between threads when a long task is running in the parent subshell (main thread) and, whilst it is running, a child subshell is created, used, and deleted.
sequenceDiagram
participant client as Client
participant control as Control thread
participant shell as Shell channel thread
participant main as Main thread
client->>+shell: Execute request (main shell)
shell->>-main: Execute request (inproc)
activate main
client->>+control: Create subshell request
control->>-shell: Create subshell request (inproc)
activate shell
create participant subshell as Subshell thread
shell-->>subshell: Create subshell thread
shell->>control: Create subshell reply (inproc)
deactivate shell
activate control
control->>-client: Create subshell reply
client->>+shell: Execute request (subshell)
shell->>-subshell: Execute request (inproc)
activate subshell
subshell->>shell: Execute reply (inproc)
deactivate subshell
activate shell
shell->>-client: Execute reply (subshell)
client->>+control: Delete subshell request
control->>-shell: Delete subshell request (inproc)
activate shell
destroy subshell
shell-->>subshell: Delete subshell thread
shell->>control: Delete subshell reply (inproc)
deactivate shell
activate control
control->>-client: Delete subshell reply
main->>shell: Execute reply (inproc)
deactivate main
activate shell
shell->>-client: Execute reply (main shell)
Future work
ipykernel
- Shell channel thread deserialises ~the whole~ some of the message to get the `subshell_id`. Ideally it would only deserialise the header. May need changes in Jupyter Client.
- Signalling a subshell to stop uses a `threading.Event` following the existing `anyio` implementation, which requires an extra thread per `Event`. It would be nice if this could be changed so a subshell is a single thread, not two.
- Execution count. Should either be a separate count per subshell or a single count for a kernel. Needs a decision and changes in IPython, as it is currently not atomic.
- History. Related to item 2 above.
- `input()` on more than one subshell at the same time runs but does not store correctly.
- Debugger use needs investigating.
- Busy/idle status needs investigating. Should there, as now, be separate status for each subshell, or the concept of kernel (i.e. any subshell) busy status? This issue is much wider than subshells as it includes status of the control channel, and how Jupyter Server should track status (jupyter-server/jupyter_server#1429).
- Use of display hooks for e.g. Matplotlib. Should these be on the parent subshell, or child subshells too?
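
To illustrate the first item in the list above, here is a minimal sketch of what deserialising only the header could look like. It assumes the standard Jupyter wire format (identity frames, the `<IDS|MSG>` delimiter, the HMAC signature, then the header, parent header, metadata and content as separate JSON frames) and that `subshell_id` is an optional header field. `extract_subshell_id` is a hypothetical helper, not a function in ipykernel or Jupyter Client, and it deliberately skips signature verification.

```python
import json

DELIM = b"<IDS|MSG>"


def extract_subshell_id(frames):
    """Return the subshell_id from a multipart message, or None if absent.

    Only the header frame is parsed; parent_header, metadata and content
    are left as raw bytes. The signature is NOT checked in this sketch.
    """
    idx = frames.index(DELIM)
    # Layout after the delimiter: signature, header, parent_header, ...
    header = json.loads(frames[idx + 2])
    return header.get("subshell_id")
```

A message without a `subshell_id` in its header yields `None` here, which would correspond to the parent subshell.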
JupyterLab
The JupyterLab branch I am using to demo this isn't really intended to be merged. If it were, it would need:
- Check `kernel_info` to see if subshells are supported.
- Delete a subshell when its `ConsolePanel` is closed.
- Report subshell IDs in tree view?
- Display of subshell busy/idle status.
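
As a rough illustration of the `kernel_info` check in the list above, the sketch below is written in Python rather than the TypeScript a JupyterLab change would use. It assumes, from my reading of the JEP and not from anything stated in this description, that a supporting kernel advertises the feature as the string `"kernel subshells"` in a `supported_features` list in the `kernel_info_reply` content. `supports_subshells` is a hypothetical helper name.

```python
def supports_subshells(kernel_info_content):
    """Return True if a kernel_info_reply content dict advertises subshells.

    Assumption: support is signalled by "kernel subshells" appearing in an
    optional "supported_features" list. Kernels that omit the field are
    treated as not supporting subshells.
    """
    features = kernel_info_content.get("supported_features", [])
    return "kernel subshells" in features
```

Treating a missing field as "not supported" keeps a client working against kernels that know nothing about subshells.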
(Edited for clarity)