jupyter_server
Kernelspec Caching
This pull request introduces a new configurable feature called KernelSpec Caching. The KernelSpecCache instance supports the same retrieval methods as a KernelSpecManager and contains the configured KernelSpecManager instance. If caching is not enabled (the default), the cache is a direct pass-through to the KernelSpecManager; otherwise it acts as a read-through cache, deferring to the KernelSpecManager on any cache miss. This functionality has proven useful in Enterprise Gateway, where it has existed for a few years. In that implementation, the watchdog package is used to detect changes that require cache updates.
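As a rough illustration of the pass-through versus read-through behavior described above, here is a minimal sketch. The class name ReadThroughCache, the fetch callable, and the manager_calls counter are hypothetical and are not taken from this pull request; only cache_enabled and get_kernel_spec mirror names used in the description.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Hypothetical stand-in for the caching behavior, not the PR's class."""

    def __init__(self, fetch: Callable[[str], dict], cache_enabled: bool = False):
        # `fetch` stands in for KernelSpecManager.get_kernel_spec (assumption).
        self._fetch = fetch
        self.cache_enabled = cache_enabled
        self._items: Dict[str, dict] = {}
        self.manager_calls = 0  # how often we deferred to the manager

    def get_kernel_spec(self, name: str) -> dict:
        if not self.cache_enabled:
            # Caching disabled: every call is a direct pass-through.
            self.manager_calls += 1
            return self._fetch(name)
        if name not in self._items:
            # Cache miss: defer to the manager, then remember the result.
            self.manager_calls += 1
            self._items[name] = self._fetch(name)
        return self._items[name]
```

With cache_enabled=True, repeated lookups of the same name reach the manager only once; with it disabled, every lookup does.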
By introducing kernelspec caching, we can now define events corresponding to the addition, update, and deletion of kernel specifications and get closer to removing the 10-second polling performed by Lab once it has been updated to consume kernelspec events.
Monitors
In addition to being enabled via the cache_enabled configurable, KernelSpecCache supports pluggable monitors that are responsible for detecting out-of-band changes to the cached items and keeping the cache up to date. A kernelspec cache monitor is registered via the entry points group "jupyter_server.kernelspec_monitors" to introduce a layer of decoupling. This pull request includes two monitors:
- KernelSpecWatchdogMonitor under the entry point name "watchdog-monitor": This monitor uses the watchdog package to monitor changes to directories containing kernelspec definitions. Because this monitor uses the watchdog package, an optional dependency has been added for users wishing to use this monitor: pip install jupyter_server[watchdog-monitor]
- KernelSpecPollingMonitor under the entry point name "polling-monitor": This monitor periodically polls (via a configurable interval trait) for kernelspec changes and computes an MD5 hash on each entry to further determine changes. It only updates the cache when the hash values have changed (or are new) and when it determines a kernelspec has been removed. The interval's default value is 30 (sec).
KernelSpecPollingMonitor is the default monitor used since it does not introduce new packages.
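The hash-based change detection used by the polling monitor can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names spec_hash and diff_specs are hypothetical, and hashing the JSON form of each spec is an assumption about what "each entry" means, not the PR's actual code.

```python
import hashlib
import json
from typing import Dict, Set, Tuple


def spec_hash(spec: dict) -> str:
    """MD5 digest of a spec's JSON form (change detection, not security)."""
    payload = json.dumps(spec, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def diff_specs(
    known: Dict[str, str], current: Dict[str, dict]
) -> Tuple[Set[str], Set[str]]:
    """Return (changed_or_new, removed) names and update `known` in place."""
    changed = set()
    for name, spec in current.items():
        digest = spec_hash(spec)
        if known.get(name) != digest:
            # New entry, or its contents changed since the last poll.
            known[name] = digest
            changed.add(name)
    # Anything we knew about that is no longer present was removed.
    removed = set(known) - set(current)
    for name in removed:
        del known[name]
    return changed, removed
```

A polling loop would call diff_specs once per interval and touch the cache only for the names it returns.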
Other monitors that would be useful are:
- an event consumer monitor where, once we add event production to the cache, then an event consumer monitor could be used to receive kernelspec update events from remote Kernel Servers (e.g., Gateway or jupyverse) rather than having to rely on polling.
- a monitor similar to the "watchdog-monitor", but using watchfiles instead, since we have other needs for watchfiles. At that point, we may be able to make "watchfiles-monitor" the default, assuming we include the watchfiles package.
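Because monitors are registered through an entry points group, additional monitors like these could be discovered without changes to the cache itself. The following is a minimal sketch of such a lookup using the standard library; the helper name find_monitors is hypothetical, and the PR may resolve monitor_name differently.

```python
import sys
from importlib import metadata

# Group name as given in the description above.
GROUP = "jupyter_server.kernelspec_monitors"


def find_monitors(group: str = GROUP) -> dict:
    """Map entry point names to entry point objects for `group`."""
    if sys.version_info >= (3, 10):
        # Python 3.10+ supports selecting a group by keyword.
        entries = metadata.entry_points(group=group)
    else:
        # Python 3.8/3.9 return a dict keyed by group name.
        entries = metadata.entry_points().get(group, [])
    return {ep.name: ep for ep in entries}
```

If a package registering "polling-monitor" were installed, find_monitors()["polling-monitor"].load() would return the registered object; with nothing installed, the result is simply an empty dict.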
KernelSpec caching is disabled by default. If we want to enable it by default, we'll need to adjust some tests (probably only the kernelspecs and perhaps the kernels API tests) to configure a much shorter polling interval, or switch the default to the "watchfiles-monitor" (once implemented), since it sounds like there's a preference for watchfiles over watchdog. (Note: Enterprise Gateway happened to use watchdog a few years ago, which is why the watchdog monitor exists. If we built a "watchfiles-monitor", I have no affinity for "watchdog-monitor" and see no reason to keep it unless we find advantages over watchfiles.)
Class Hierarchy
KernelSpecCache contains an instance of KernelSpecManager and an instance of the KernelSpecMonitorBase subclass that corresponds to the monitor_name configurable.
---
title: KernelSpecCache Class Hierarchy
---
classDiagram
KernelSpecCache --* KernelSpecManager
KernelSpecCache --* KernelSpecMonitorBase
KernelSpecMonitorBase <|-- KernelSpecWatchdogMonitor
KernelSpecMonitorBase <|-- KernelSpecPollingMonitor
KernelSpecMonitorBase <|-- BYOMonitor
class KernelSpecCache{
+str monitor_name
+bool cache_enabled
+get_kernel_spec(name)
+get_all_specs()
+get_item(name)
+get_all_items()
+put_item()
+put_all_items()
+remove_item(name)
+remove_all_items()
}
class KernelSpecManager{
+get_kernel_spec(name)
+get_all_specs()
}
class KernelSpecMonitorBase["KernelSpecMonitorBase (ABC)"]{
+initialize()*
+destroy()*
}
class KernelSpecPollingMonitor{
+float interval
}
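The BYOMonitor node in the diagram suggests that a custom monitor only needs to implement the two abstract methods. A minimal sketch of that shape follows; MonitorBase here is a hypothetical stand-in for KernelSpecMonitorBase that models only initialize() and destroy(), and the running attribute is purely illustrative.

```python
from abc import ABC, abstractmethod


class MonitorBase(ABC):
    """Hypothetical stand-in for KernelSpecMonitorBase (two methods only)."""

    @abstractmethod
    def initialize(self) -> None:
        """Start watching for kernelspec changes."""

    @abstractmethod
    def destroy(self) -> None:
        """Stop watching and release any resources."""


class BYOMonitor(MonitorBase):
    """Bring-your-own monitor; `running` is an illustrative attribute."""

    def __init__(self) -> None:
        self.running = False

    def initialize(self) -> None:
        self.running = True

    def destroy(self) -> None:
        self.running = False
```

Because both methods are abstract, the base class cannot be instantiated directly, while a subclass that defines both can.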
Event Support (Future)
With caching in place, we should be able to fire add and update kernelspec events from KernelSpecCache.put_item() and delete kernelspec events from KernelSpecCache.remove_item(), since the corresponding put_all_items() and remove_all_items() methods simply call the single-item versions.
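A rough sketch of how events could be fired from the single-item methods is shown below. Everything here is an assumption for illustration: the class EventingCache, the emit callback, the method signatures, and the "add"/"update"/"delete" action strings are hypothetical, since event production is not part of this pull request.

```python
from typing import Callable, Dict


class EventingCache:
    """Hypothetical cache whose single-item methods emit events."""

    def __init__(self, emit: Callable[[str, str], None]) -> None:
        self._items: Dict[str, dict] = {}
        self._emit = emit  # called as emit(action, kernelspec_name)

    def put_item(self, name: str, spec: dict) -> None:
        # An existing name is an update; a new name is an add.
        action = "update" if name in self._items else "add"
        self._items[name] = spec
        self._emit(action, name)

    def put_all_items(self, specs: Dict[str, dict]) -> None:
        # Delegates to put_item, so events come for free.
        for name, spec in specs.items():
            self.put_item(name, spec)

    def remove_item(self, name: str) -> None:
        if name in self._items:
            del self._items[name]
            self._emit("delete", name)

    def remove_all_items(self) -> None:
        # Copy the keys first because remove_item mutates the dict.
        for name in list(self._items):
            self.remove_item(name)
```

Since the all-items methods delegate, the event logic lives in exactly two places.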
Hi @blink1073. The kernelspec cache test failures are due to a missing watchdog dependency. I figured placing it into the "test" optional dependencies would address this, but don't really see that optional dependency being used other than for the examples tests. Is there an additional location I should be updating for this to be included in the Python tests? Is this kind of thing typically performed in jupyterlab/maintainer-tools/.github/actions/base-setup@v1?
EDIT: Hmm, I'm also seeing a different issue not seen in dev (sorry about that), so will need to address that as well, but I think my questions are still applicable (although it looks like Windows must include watchdog anyway - or some resolution has occurred there).
TypeError: 'type' object is not subscriptable. We can't use dict[str, str] until Python 3.9 iirc, you'll need to use Dict from typing. When we run hatch run test or hatch run cov it should pick up the test dependencies.
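For reference, the annotation change being suggested looks roughly like this; the function name upper_keys is hypothetical and only serves to carry the annotation.

```python
from typing import Dict


def upper_keys(values: Dict[str, str]) -> Dict[str, str]:
    # typing.Dict works on Python 3.8, whereas subscripting the builtin
    # (dict[str, str]) at runtime raises TypeError before Python 3.9.
    return {key.upper(): value for key, value in values.items()}
```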
Thanks Steve. Yeah, that's the issue I saw as well and things are moving along better in my fork. I had originally thought it was the watchdog dependency but, after closer inspection, it was all about the indexing issue. Looks like the docs build is failing due to the new sub folder for the monitors that I'll follow up on. Thanks for your help.
~To help my understanding, does jupyterlab/maintainer-tools/.github/actions/base-setup@v1 essentially issue an install with the optional [test] dependency?~ Crap - I just need to FULLY READ your previous response. :smile: