jupyter-resource-usage
Feedback/Issues with Kernel Usage
Description
When the kernel usage panel is open in the right panel and the Restart kernel and run all cells button (>>) is clicked, JupyterLab shows "The kernel for ... appears to have died. It will start automatically" and the error below is printed to the console:
jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2 Uncaught (in promise) Error: Canceled future for kernel_info_request message before replies were done
at u.dispose (jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1006780)
at jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1000601
at Map.forEach (<anonymous>)
at b._clearKernelState (jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1000586)
at jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1003130
The following is printed by jupyter-server:
Got events for closed stream None
Judging by the errors, could this be happening because we query the channel for kernel info while the kernel is restarting?
Reproduce
- Open JupyterLab and a notebook with a cell containing, say, print('Hi')
- Open kernel usage from the right panel and keep it open
- Click the Restart kernel and run all cells (>>) button.
The issue is intermittent and not consistently reproducible. Repeating the last step several times may reproduce it.
Expected behavior
The error should not be thrown when the right panel is kept open and the Restart kernel and run all cells button is clicked.
Context
jupyter-server==1.23.2
jupyter-resource-usage==0.7.0
ipykernel==6.15.1
jupyterlab==3.5.0
ipython==8.8.0
Adding two more issues / feedback for kernel usage:
- Long running cells having different types of code show different results:
  while True:
      x=1
  shows "Kernel usage is not available", whereas
  import time
  time.sleep(100)
  shows the kernel stats.
- UX related - the kernel usage panel does not always show stats of the active notebook. Reproducer:
  - Open the kernel usage panel on the right.
  - Open two notebooks one by one, say A.ipynb and B.ipynb (the kernel usage panel should show stats for them correctly: first A and then B).
  - Open the Launcher (Ctrl + Shift + L) -> it shows stats of the last active notebook (B); instead it should show no stats.
  - Close B.ipynb directly from its tab (without opening that tab) so that we still remain at the Launcher -> it shows stats of notebook A; instead it should show no stats.
  - When A.ipynb is closed in a similar manner -> it shows no stats, as expected.
Thanks @nishikantparmariam for reporting.
cc @krassowski @echarles who might be more familiar with the kernel usage panel since it comes from https://github.com/Quansight/jupyterlab-kernel-usage
> Restart kernel and run all cells button (>>) is clicked, Jupyterlab shows The kernel for ... appears to have died. It will start automatically and below error is printed into console:
I see the error in the console (but not the "kernel appears to have died" message; I will keep trying to reproduce it). The problem seems to be that we are sending the first request before the kernel is ready. I will open a PR if I can get this fixed.
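As a rough illustration of the "wait until the kernel is ready" idea, here is a minimal Python sketch. It is not the extension's code (the panel itself lives in the frontend); `KernelGate`, `mark_restarting`, `mark_ready` and `request` are hypothetical names chosen for this example only.

```python
import asyncio


class KernelGate:
    """Hypothetical helper: holds requests back while the kernel restarts."""

    def __init__(self):
        # Create the gate inside a running event loop so the Event
        # belongs to that loop on older Python versions.
        self._ready = asyncio.Event()

    def mark_restarting(self):
        # Called when a restart begins: new requests must wait.
        self._ready.clear()

    def mark_ready(self):
        # Called once the kernel reports it is ready again.
        self._ready.set()

    async def request(self, send):
        # Wait for readiness, then run the caller-supplied coroutine function.
        await self._ready.wait()
        return await send()
```

With a gate like this, a request issued during a restart is only sent after `mark_ready()` is called, so there is no in-flight future for the restart to cancel.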
> UX related - kernel usage panel does not always show stats of active notebook, reproducer:
PR coming.
> Long running cells having different types of code show different results:
The example with the infinite loop randomly shows either "Kernel usage is not available" or the usage. This is because the usage_request sent to the kernel times out and we end up in this branch:
https://github.com/jupyter-server/jupyter-resource-usage/blob/7299c08ef8df79e028443d15ac8f2f4253c5793b/jupyter_resource_usage/api.py#L112-L115
The just-merged PR #177 effectively increased the timeout from 1 to 6 seconds, so the "not available" message shows up less frequently (because we time out less often), but as long as the ipykernel usage_request relies on threading rather than multiprocessing it will still show up from time to time. This seems to stem from how threads are implemented in Python, more specifically the GIL. For example:
import sys
sys.setswitchinterval(0.0000001)
while True:
    x = 1
# see that the "not available" message (almost) never shows up

import sys
sys.setswitchinterval(10)
while True:
    x = 1
# see that the "not available" message always shows up
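The same effect can be measured outside a notebook. The sketch below is only an illustration under assumptions: it assumes a standard GIL-enabled CPython build, and `reply_latency` is a hypothetical helper, not part of ipykernel or this extension. A helper thread stands in for the thread answering the usage request, and the busy loop stands in for the running cell.

```python
import sys
import threading
import time


def reply_latency(switch_interval, wait=0.01, timeout=5.0):
    """Seconds until a helper thread replies while the main thread spins."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(switch_interval)
    try:
        replies = []

        def responder():
            time.sleep(wait)  # blocked without the GIL, like waiting on a socket
            replies.append(time.perf_counter())  # needs the GIL back to run

        helper = threading.Thread(target=responder)
        start = time.perf_counter()
        helper.start()
        deadline = start + timeout
        # Pure-Python busy loop: it never gives up the GIL voluntarily.
        while not replies and time.perf_counter() < deadline:
            x = 1
        helper.join()
        return replies[0] - start
    finally:
        # Always restore the interpreter's original switch interval.
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("tiny switch interval:", reply_latency(1e-6))
    print("0.3 s switch interval:", reply_latency(0.3))
```

The reply should arrive noticeably later with the larger interval, because the helper has to wait roughly one switch interval before the busy thread is asked to hand over the GIL. A reply that arrives later than the request timeout is what the panel reports as "not available".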
Since changing the user's switch interval is not something we should do, we can instead consider increasing the timeout further. I propose that we:
- set the timeout to 10 seconds
- add a more informative message on the frontend if the timeout does occur (preserving the old usage statistics but greying them out).
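To make the proposal concrete, here is a minimal Python sketch of the "keep the old statistics but flag them as stale" behaviour. It is an assumption-laden illustration, not the extension's implementation: `UsagePoller`, `poll` and the `stale` flag are hypothetical names, and `fetch` stands for whatever coroutine function actually requests the usage.

```python
import asyncio


class UsagePoller:
    """Hypothetical poller that keeps the last good reading on timeout."""

    def __init__(self, fetch, timeout=10.0):
        self._fetch = fetch      # coroutine function returning usage data
        self._timeout = timeout  # seconds, matching the proposed 10 s
        self._last = None        # last successful reading, if any

    async def poll(self):
        try:
            self._last = await asyncio.wait_for(self._fetch(), self._timeout)
            return {"usage": self._last, "stale": False}
        except asyncio.TimeoutError:
            # Timed out: return the previous reading, marked as stale,
            # so a UI could grey it out instead of hiding it.
            return {"usage": self._last, "stale": True}
```

A frontend consuming such a result could render the numbers normally when `stale` is false and grey them out, with an explanatory message, when it is true.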
Of note, "Canceled future for kernel_info_request message before replies were done" appears whether or not this extension is installed when the kernel is restarted quickly.