docker-stacks
docker-stacks copied to clipboard
[BUG] - Startup issues with low CPU allocation
What docker image(s) are you using?
base-notebook
OS system and architecture running docker image
linux/amd64
What Docker command are you running?
docker run -v $PWD:/home/jovyan -p 8888:8888 --memory=1000m --cpus=0.5 jupyter/base-notebook:lab-3.4.4
How to Reproduce the problem?
- Dock run with half a CPU
- Visit http://127.0.0.1:8888
- Start and save 7 empty python notebooks
- Stop docker run command
- Start docker run command again
- Visit http://127.0.0.1:8888
- Try to use any notebook
Command output
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444fb20>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444fb20>()>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-101' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444e920>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444e920>()>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-102' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback Task.task_wakeup(<Future finished result=None>)
handle: <Handle Task.task_wakeup(<Future finished result=None>)>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-99' coro=<MultiKernelManager._add_kernel_when_ready() running at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:173> wait_for=<Future finished result=None> cb=[Task.task_wakeup()]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444e4a0>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444e4a0>()>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-103' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback <TaskStepMethWrapper object at 0x7f0d5444c4c0>()
handle: <Handle <TaskStepMethWrapper object at 0x7f0d5444c4c0>()>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-104' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1665> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback Task.task_wakeup(<Task finishe...> result=None>)
handle: <Handle Task.task_wakeup(<Task finishe...> result=None>)>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-93' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-97' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Exception in callback Task.task_wakeup(<Task finishe...> result=None>)
handle: <Handle Task.task_wakeup(<Task finishe...> result=None>)>
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
RuntimeError: Cannot enter into task <Task pending name='Task-94' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-98' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]> while another task <Task pending name='Task-100' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:921> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]> is being executed.
Traceback (most recent call last):
File "/opt/conda/bin/jupyter-lab", line 10, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.10/site-packages/jupyter_server/extension/application.py", line 594, in launch_instance
serverapp.start()
File "/opt/conda/lib/python3.10/site-packages/jupyter_server/serverapp.py", line 2815, in start
self.start_ioloop()
File "/opt/conda/lib/python3.10/site-packages/jupyter_server/serverapp.py", line 2801, in start_ioloop
self.io_loop.start()
File "/opt/conda/lib/python3.10/site-packages/tornado/platform/asyncio.py", line 215, in start
self.asyncio_loop.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 600, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1881, in _run_once
handle = self._ready.popleft()
IndexError: pop from an empty deque
/opt/conda/lib/python3.10/configparser.py:1250: RuntimeWarning: coroutine 'PeriodicCallback._run' was never awaited
getter = functools.partial(self.get, _impl=getattr(parser, key))
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task was destroyed but it is pending!
task: <Task pending name='Task-101' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]>
Task was destroyed but it is pending!
task: <Task pending name='Task-102' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]>
Task was destroyed but it is pending!
task: <Task pending name='Task-103' coro=<PeriodicCallback._run() running at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:917> cb=[IOLoop.add_future.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/ioloop.py:687]>
/opt/conda/lib/python3.10/configparser.py:1250: RuntimeWarning: coroutine 'RequestHandler._execute' was never awaited
getter = functools.partial(self.get, _impl=getattr(parser, key))
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Task was destroyed but it is pending!
task: <Task pending name='Task-104' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1665> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]>
Task was destroyed but it is pending!
task: <Task pending name='Task-93' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-97' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]>
Task was destroyed but it is pending!
task: <Task pending name='Task-94' coro=<RequestHandler._execute() running at /opt/conda/lib/python3.10/site-packages/tornado/web.py:1713> wait_for=<Task finished name='Task-98' coro=<MultiKernelManager._add_kernel_when_ready() done, defined at /opt/conda/lib/python3.10/site-packages/jupyter_client/multikernelmanager.py:169> result=None> cb=[_HandlerDelegate.execute.<locals>.<lambda>() at /opt/conda/lib/python3.10/site-packages/tornado/web.py:2361]>
Expected behavior
Workspace with multiple notebooks running as was the case before the container was restarted.
Actual behavior
Environment is unusable. None of the kernels start the second time.
Anything else?
Issues seems to be resolved with the following modifications to the base image. Best tested with a new bind mount volume.
FROM jupyter/base-notebook:lab-3.4.4
RUN mamba install --quiet --yes \
'tornado==6.1.0' \
'nest-asyncio==1.5.1' && \
mamba clean --all -f -y && \
fix-permissions "${CONDA_DIR}" && \
fix-permissions "/home/${NB_USER}"
Why do you think tornado and nest-asyncio fixes the issue? Do later version of tornado and nest-asyncio work as well? Thx
It does not fix the issue, but it makes it much less common. Only a couple of percent of launches will fail after using the versions specified. This is the simplest reproducible case that I could create for this issue. Is it something that you can also reproduce?
I believe it is probably related to the following issue https://github.com/jupyterlab/jupyterlab/issues/11934
This is a jupyterlab issue which will be fixed in the future. If you need it sooner please submit a jupyterlab PR based on this issue.
There seems to be a PR for this issue already.
Thank you @Bidek56. We will wait for the upstream fix then.
@minrk @blink1073 @kevin-bates could you please take a look here (and maybe to the attached PR)?
We're are going to remove the use of nest-asyncio entirely in client 8: https://github.com/jupyter/jupyter_client/issues/833. There is already a PR merged that does this: https://github.com/jupyter/jupyter_client/pull/835, and an alpha release.
Great news @blink1073, thank you! We will be waiting for release then.
Upstream issue was resolved, and the latest Jupyter-client doesn't have nest-asyncio.
I have also checked the issue you have (I was having it as well) and it seems to work fine now.