docker-stacks icon indicating copy to clipboard operation
docker-stacks copied to clipboard

[BUG] - Startup issues with low CPU allocation

Open 0nebody opened this issue 3 years ago • 5 comments

What docker image(s) are you using?

base-notebook

OS system and architecture running docker image

linux/amd64

What Docker command are you running?

docker run -v $PWD:/home/jovyan -p 8888:8888 --memory=1000m --cpus=0.5 jupyter/base-notebook:lab-3.4.4

How to Reproduce the problem?

  1. Dock run with half a CPU
  2. Visit http://127.0.0.1:8888
  3. Start and save 7 empty python notebooks
  4. Stop docker run command
  5. Start docker run command again
  6. Visit http://127.0.0.1:8888
  7. Try to use any notebook

Command output

Exception in callback <TaskStepMethWrapper object at 0x7f0d5444fb20>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444fb20>()>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-101' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444e920>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444e920>()>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-102' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback Task.task_wakeup(<Future finished result=None>)
handle: <Handle Task.task_wakeup(<Future finished result=None>)>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-99' coro=<MultiKernelManager._add_kernel_when_ready() running at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:173> wait_for=<Future finished result=None> cb=[Task.task_wakeup()]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444e4a0>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444e4a0>()>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-103' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444c4c0>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444c4c0>()>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-104' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1665> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback Task.task_wakeup(<Task finishe...> result=None>)
handle: <Handle Task.task_wakeup(<Task finishe...> result=None>)>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-93' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-97' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback Task.task_wakeup(<Task finishe...> result=None>)
handle: <Handle Task.task_wakeup(<Task finishe...> result=None>)>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-94' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-98' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Traceback (most recent call last):
  File "/opt/conda/bin/jupyter-lab", line 10, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/jupyter_server/extension/application.py", line 594, in launch_instance
    serverapp.start()
  File "/opt/conda/lib/python3.10/site-packages/jupyter_server/serverapp.py", line 2815, in start
    self.start_ioloop()
  File "/opt/conda/lib/python3.10/site-packages/jupyter_server/serverapp.py", line 2801, in start_ioloop
    self.io_loop.start()
  File "/opt/conda/lib/python3.10/site-packages/tornado/platform/asyncio.py", line 215, in start
    self.asyncio_loop.run_forever()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 600, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1881, in _run_once
    handle = self._ready.popleft()
IndexError: pop from an empty deque
/opt/conda/lib/python3.10/configparser.py:1250: RuntimeWarning: coroutine 'PeriodicCallback._run' was never awaited
  getter = functools.partial(self.get, _impl=getattr(parser, key))
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task was destroyed but it is pending!
task: <Task pending name='Task-101' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]>
Task was destroyed but it is pending!
task: <Task pending name='Task-102' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]>
Task was destroyed but it is pending!
task: <Task pending name='Task-103' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]>
/opt/conda/lib/python3.10/configparser.py:1250: RuntimeWarning: coroutine 'RequestHandler._execute' was never awaited
  getter = functools.partial(self.get, _impl=getattr(parser, key))
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task was destroyed but it is pending!
task: <Task pending name='Task-104' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1665> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]>
Task was destroyed but it is pending!
task: <Task pending name='Task-93' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-97' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]>
Task was destroyed but it is pending!
task: <Task pending name='Task-94' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-98' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]>

Expected behavior

Workspace with multiple notebooks running as was the case before the container was restarted.

Actual behavior

Environment is unusable. None of the kernels start the second time.

Anything else?

Issues seems to be resolved with the following modifications to the base image. Best tested with a new bind mount volume.

FROM jupyter/base-notebook:lab-3.4.4

RUN mamba install --quiet --yes \
    'tornado==6.1.0' \
    'nest-asyncio==1.5.1' && \
    mamba clean --all -f -y && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"

0nebody avatar Aug 10 '22 22:08 0nebody

Why do you think tornado and nest-asyncio fixes the issue? Do later version of tornado and nest-asyncio work as well? Thx

Bidek56 avatar Aug 18 '22 16:08 Bidek56

It does not fix the issue, but it makes it much less common. Only a couple of percent of launches will fail after using the versions specified. This is the simplest reproducible case that I could create for this issue. Is it something that you can also reproduce?

I believe it is probably related to the following issue https://github.com/jupyterlab/jupyterlab/issues/11934

0nebody avatar Aug 18 '22 22:08 0nebody

This is a jupyterlab issue which will be fixed in the future. If you need it sooner please submit a jupyterlab PR based on this issue.

Bidek56 avatar Aug 19 '22 12:08 Bidek56

There seems to be a PR for this issue already.

Bidek56 avatar Aug 19 '22 19:08 Bidek56

Thank you @Bidek56. We will wait for the upstream fix then.

mathbunnyru avatar Aug 21 '22 18:08 mathbunnyru

@minrk @blink1073 @kevin-bates could you please take a look here (and maybe to the attached PR)?

mathbunnyru avatar Oct 13 '22 15:10 mathbunnyru

We're are going to remove the use of nest-asyncio entirely in client 8: https://github.com/jupyter/jupyter_client/issues/833. There is already a PR merged that does this: https://github.com/jupyter/jupyter_client/pull/835, and an alpha release.

blink1073 avatar Oct 13 '22 17:10 blink1073

Great news @blink1073, thank you! We will be waiting for release then.

mathbunnyru avatar Oct 15 '22 14:10 mathbunnyru

Upstream issue was resolved, and the latest Jupyter-client doesn't have nest-asyncio. I have also checked the issue you have (I was having it as well) and it seems to work fine now.

mathbunnyru avatar Feb 28 '23 12:02 mathbunnyru