jupyter-resource-usage

Feedback/Issues with Kernel Usage

Open nishikantparmariam opened this issue 2 years ago • 4 comments

Description

When the kernel usage panel is open in the right panel and the "Restart kernel and run all cells" button (>>) is clicked, JupyterLab shows "The kernel for ... appears to have died. It will start automatically" and the error below is printed to the console:

jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2 Uncaught (in promise) Error: Canceled future for kernel_info_request message before replies were done
    at u.dispose (jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1006780)
    at jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1000601
    at Map.forEach (<anonymous>)
    at b._clearKernelState (jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1000586)
    at jlab_core.e37d4bbc8c984154bc26.js?v=e37d4bbc8c984154bc26:2:1003130

And the following is printed by jupyter-server:

Got events for closed stream None

Judging by the errors, could this be happening because we query the channel for kernel info while the kernel is restarting?

Reproduce

  1. Open JupyterLab and a notebook with some cells containing, say, print('Hi')
  2. Open kernel usage from the right panel and keep it open
  3. Click the "Restart kernel and run all cells" (>>) button.

The issue is intermittent and not consistently reproducible. Repeating the last step several times may trigger it.

Expected behavior

No error should be thrown when the right panel is kept open and the "Restart kernel and run all cells" button is clicked.

Context

jupyter-server==1.23.2
jupyter-resource-usage==0.7.0
ipykernel==6.15.1
jupyterlab==3.5.0
ipython==8.8.0

nishikantparmariam, Jan 31 '23 08:01

Adding two more issues / feedback for kernel usage:

  • Long running cells having different types of code show different results:

    while True:
        x = 1

    This shows "Kernel usage is not available", whereas

    import time
    time.sleep(100)
    

    shows the kernel stats.

  • UX related - kernel usage panel does not always show stats of active notebook, reproducer:

    1. Open the kernel usage panel on the right.
    2. Open two notebooks one by one, say A.ipynb and B.ipynb (the kernel usage panel should show stats for them correctly: first A, then B).
    3. Open the Launcher (Ctrl + Shift + L) -> it shows the stats of the last active notebook (B); it should show no stats instead.
    4. Close B.ipynb directly from its tab (without switching to that tab) so that the Launcher stays active -> it shows the stats of notebook A; it should show no stats instead.
    5. When A.ipynb is closed in the same manner -> it shows no stats, as expected.

nishikantparmariam, Feb 01 '23 14:02

Thanks @nishikantparmariam for reporting.

cc @krassowski @echarles who might be more familiar with the kernel usage panel since it comes from https://github.com/Quansight/jupyterlab-kernel-usage

jtpio, Feb 14 '23 13:02

> Restart kernel and run all cells button (>>) is clicked, Jupyterlab shows The kernel for ... appears to have died. It will start automatically and below error is printed into console:

I see the error in the console (but not the "kernel appears to have died" message; I will keep trying to reproduce it). The problem seems to be that we are sending the first request before the kernel is ready. I will open a PR if I can get this fixed.
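
To illustrate the idea of holding the first request until the kernel is ready, here is a minimal sketch. It is a toy model in Python, not the JupyterLab or extension code (the real panel lives in the TypeScript frontend); `KernelStub` and `poll_once` are hypothetical names.

```python
import asyncio


class KernelStub:
    """Hypothetical stand-in for a kernel connection; not a Jupyter API."""

    def __init__(self):
        self.ready = asyncio.Event()

    async def restart(self, boot_time=0.05):
        self.ready.clear()
        await asyncio.sleep(boot_time)  # the kernel is starting up
        self.ready.set()

    async def request_usage(self):
        if not self.ready.is_set():
            raise RuntimeError("request sent before the kernel was ready")
        return {"kernel_cpu": 0.0}


async def poll_once(kernel):
    # Hold the request until the restart has finished.
    await kernel.ready.wait()
    return await kernel.request_usage()


async def demo():
    kernel = KernelStub()
    restart = asyncio.create_task(kernel.restart())
    usage = await poll_once(kernel)
    await restart
    return usage


usage = asyncio.run(demo())
print(usage)
```

Without the `await kernel.ready.wait()` line, `poll_once` would reach the stub while it is still restarting and raise.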

> UX related - kernel usage panel does not always show stats of active notebook, reproducer:

PR coming.

> Long running cells having different types of code show different results:

The infinite-loop example randomly shows either "Kernel usage is not available" or the usage. This is because the usage_request sent to the kernel times out and we end up in this branch:

https://github.com/jupyter-server/jupyter-resource-usage/blob/7299c08ef8df79e028443d15ac8f2f4253c5793b/jupyter_resource_usage/api.py#L112-L115

The just-merged PR #177 de facto increased the timeout from 1 to 6 seconds, so the "not available" message shows up less frequently (because we time out less frequently), but as long as the ipykernel usage_request relies on threading rather than multiprocessing, it will still show up from time to time. This seems to stem from how threads are implemented in Python, more specifically the GIL. For example:

import sys
sys.setswitchinterval(0.0000001)
while True:
    x = 1
# see that the "not available" message (almost) never shows up

import sys
sys.setswitchinterval(10)
while True:
    x = 1
# see that the "not available" message always shows up
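
The same effect can be measured without hanging the kernel. The sketch below is only an illustration, not code from ipykernel or this extension: `wakeup_delay` is a hypothetical helper, and it assumes a standard CPython build with the GIL. A helper thread sleeps briefly (which releases the GIL) and then times how long it waits to get the GIL back while the main thread spins in pure Python.

```python
import sys
import threading
import time


def wakeup_delay(switch_interval, nap=0.05, spin=0.5):
    """Seconds a helper thread waits for the GIL after its sleep ends,
    while the main thread runs a pure-Python busy loop."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(switch_interval)
    result = {}

    def helper():
        begin = time.perf_counter()
        time.sleep(nap)  # releases the GIL; getting it back is what we time
        result["delay"] = time.perf_counter() - begin - nap

    try:
        thread = threading.Thread(target=helper)
        thread.start()
        deadline = time.perf_counter() + spin
        while time.perf_counter() < deadline:
            x = 1  # same kind of loop as in the examples above
        thread.join()
    finally:
        sys.setswitchinterval(previous)  # always restore the original setting
    return result["delay"]


fast = wakeup_delay(0.0001)
slow = wakeup_delay(0.2)
print(f"switch interval 0.0001 s: helper delayed by {fast:.4f} s")
print(f"switch interval 0.2 s:    helper delayed by {slow:.4f} s")
```

On a GIL build the second delay should come out close to the 0.2 s switch interval, which is the same mechanism that lets the reply to usage_request miss its timeout.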

Since changing the user's switch interval is not something we should do, we can instead consider increasing the timeout further. I propose that we:

  • set the timeout to 10 seconds
  • add a more informative message on the frontend if the timeout does occur (preserving the old usage statistics but greying them out).
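
A rough sketch of the second bullet, assuming nothing about the actual handler in api.py or the frontend: `usage_or_stale`, `quick_kernel` and `busy_kernel` are hypothetical names, and `asyncio.wait_for` stands in for whatever timeout mechanism the extension really uses. On timeout the previous statistics are returned with a `stale` flag that a frontend could use to grey them out.

```python
import asyncio


async def usage_or_stale(request_usage, last_known, timeout=10.0):
    """Return fresh stats, or the last known stats flagged as stale on timeout."""
    try:
        fresh = await asyncio.wait_for(request_usage(), timeout=timeout)
    except asyncio.TimeoutError:
        # Keep the old numbers instead of dropping to "not available".
        return {"stale": True, "usage": last_known}
    return {"stale": False, "usage": fresh}


async def demo():
    async def quick_kernel():
        return {"kernel_cpu": 3.0}

    async def busy_kernel():
        await asyncio.sleep(1.0)  # stands in for a kernel that cannot reply in time
        return {"kernel_cpu": 99.0}

    # Short timeout here only so the demo finishes quickly.
    first = await usage_or_stale(quick_kernel, None, timeout=0.1)
    second = await usage_or_stale(busy_kernel, first["usage"], timeout=0.1)
    return first, second


first, second = asyncio.run(demo())
print(first)
print(second)
```

The second call times out, so it carries over the statistics from the first call and marks them as stale rather than reporting that usage is unavailable.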

krassowski, Feb 18 '23 21:02

Of note, the "Canceled future for kernel_info_request message before replies were done" error appears whether or not this extension is installed when the kernel is restarted quickly.

krassowski, Feb 19 '23 00:02