Python hangs after creating too many RichJupyterWidget instances
I am running into an issue in glue when running tests on CircleCI: it turns out that after creating too many RichJupyterWidget instances, the Python process just hangs. The following code, run on CircleCI, demonstrates the issue (showing that this is not a glue issue as such):
from qtpy.QtWidgets import QApplication
from qtconsole.inprocess import QtInProcessKernelManager
from qtconsole.rich_jupyter_widget import RichJupyterWidget
app = QApplication([''])
kernel_manager = QtInProcessKernelManager()
kernel_manager.start_kernel()
kernel_client = kernel_manager.client()
kernel_client.start_channels()
for i in range(100):
    control = RichJupyterWidget()
    control.kernel_manager = kernel_manager
    control.kernel_client = kernel_client
    control.shell = kernel_manager.kernel.shell
The Python process hangs after anywhere between 20 and 80 iterations of the loop. Here is the output of pip freeze in case you need more details:
Output of pip freeze
astrodendro==0.2.0
astropy==2.0
backports-abc==0.5
certifi==2017.4.17
chardet==3.0.4
command-not-found==0.3
cycler==0.10.0
decorator==4.1.1
dill==0.2.7.1
glue-core==0.11.0.dev251
h5py==2.7.0
idna==2.5
ipykernel==4.6.1
ipython==6.1.0
ipython-genutils==0.2.0
jedi==0.10.2
jsonschema==2.6.0
jupyter-client==5.1.0
jupyter-core==4.3.0
language-selector==0.1
matplotlib==2.0.2
mock==1.0.1
nbformat==4.3.0
networkx==1.11
numpy==1.13.1
olefile==0.44
pandas==0.20.3
pexpect==4.2.1
pickleshare==0.7.4
Pillow==4.2.1
plotly==2.0.12
prompt-toolkit==1.0.14
ptyprocess==0.5.2
py==1.4.34
PyAVM==0.9.2
pycurl==7.19.3
Pygments==2.2.0
pygobject==3.12.0
pyparsing==2.2.0
pytest==3.1.3
pytest-repeat==0.4.1
python-apt===0.9.3.5ubuntu2
python-dateutil==2.6.1
pytz==2017.2
PyWavelets==0.5.2
pyzmq==16.0.2
qtconsole==4.3.0
QtPy==1.2.1
requests==2.18.1
scikit-image==0.13.0
scipy==0.19.1
simplegeneric==0.8.1
six==1.10.0
spectral-cube==0.4.0
tornado==4.5.1
traitlets==4.3.2
typing==3.6.1
ufw===0.34-rc-0ubuntu2
unattended-upgrades==0.1
urllib3==1.21.1
wcwidth==0.1.7
xlrd==1.0.0
And the following packages were installed using apt-get:
python3 python3-setuptools python3-pytest python3-mock python3-pyqt5 python3-pyqt5.qtsvg python3-tk python3-dev
UPDATE - I've set up a simple docker container that can be used to reproduce the issue, see here for more details.
If I run with -m trace --trace, buried in the output is the following error, though I'm not sure whether it's related:
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/zmq/utils/garbage.py", line 114, in _atexit
    self.stop()
  File "/usr/local/lib/python3.4/dist-packages/zmq/utils/garbage.py", line 120, in stop
    self._stop()
  File "/usr/local/lib/python3.4/dist-packages/zmq/utils/garbage.py", line 123, in _stop
    push = self.context.socket(zmq.PUSH)
  File "/usr/local/lib/python3.4/dist-packages/zmq/sugar/context.py", line 146, in socket
    s = self._socket_class(self, socket_type, **kwargs)
  File "zmq/backend/cython/socket.pyx", line 285, in zmq.backend.cython.socket.Socket.__cinit__ (zmq/backend/cython/socket.c:3861)
zmq.error.ZMQError: Too many open files
However, this is not necessarily the same issue: when running under the tracer I can create far more widgets before Python hangs.
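As an aside, a "Too many open files" ZMQError usually means the process hit its file-descriptor limit (RLIMIT_NOFILE), so one thing worth checking as a diagnostic (this is my suggestion, not something from the original report) is what that limit is, and optionally raising the soft limit to the hard limit:

```python
# Diagnostic sketch: inspect and raise the per-process open-file limit.
# "ZMQError: Too many open files" generally means RLIMIT_NOFILE was hit.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open-file limit: soft=%s hard=%s" % (soft, hard))

# An unprivileged process may raise its soft limit up to the hard limit.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

If the hang were purely descriptor exhaustion, raising the limit should move the point at which it occurs.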
Also, if I run subprocess.call('lsof | wc -l', shell=True) in the loop, the number of files stays constant, so that might be a red herring.
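A cheaper per-iteration check than shelling out to lsof (assuming Linux, where /proc/self/fd lists this process's open descriptors) would be to count descriptors directly inside the loop and see whether the number grows per widget:

```python
# Sketch: count this process's open file descriptors without lsof.
# Assumes a Linux-style /proc filesystem.
import os

def open_fd_count():
    # /proc/self/fd contains one entry per open descriptor.
    return len(os.listdir('/proc/self/fd'))

print(open_fd_count())
```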
I should add that I still see the issue if I close the widget inside the loop:
for i in range(1000):
    print(i)
    control = RichJupyterWidget()
    control.kernel_manager = kernel_manager
    control.kernel_client = kernel_client
    control.shell = kernel_manager.kernel.shell
    control.close()
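One caveat with the loop above (hedged: I don't know that this fixes the hang): QWidget.close() only hides a widget; the underlying C++ object is not destroyed until it is deleted. A more thorough teardown could schedule deletion and flush the event queue, sketched here as a hypothetical helper:

```python
# Hypothetical helper: actually destroy a Qt widget, not just hide it.
def close_and_delete(widget, app):
    widget.close()
    widget.deleteLater()  # schedule destruction of the underlying C++ object
    app.processEvents()   # run the event loop so the deferred delete happens
```

Inside the loop this would be called as close_and_delete(control, app).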
Are you killing the kernels after each test finishes? Opening too many kernels could lead to the error you posted above about too many open files, and could also fill the container's memory completely.
@ccordoba12 - I'm not killing the kernels after each test, I'm re-using the same one as in the simple code in this issue (the kernel manager and client are stored in a global variable). I think the ZMQ error is unrelated to the original issue here.
Are you able to reproduce the issue with the code here?
Sorry if I wasn't clear: what I meant is that in the example above I re-use the same QtInProcessKernelManager inside the loop, so only one is ever created. The short code at the start of this issue is structured the same way I handle things in the tests.
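For reference, the sharing described above can be sketched as a session-scoped pytest fixture (hypothetical names; the real test suite stores the manager and client in a global variable instead):

```python
# Hypothetical sketch: one in-process kernel shared across the whole test
# session, so only a single QtInProcessKernelManager is ever created.
import pytest

@pytest.fixture(scope='session')
def kernel():
    from qtconsole.inprocess import QtInProcessKernelManager
    km = QtInProcessKernelManager()
    km.start_kernel()
    client = km.client()
    client.start_channels()
    yield km, client
    client.stop_channels()
    km.shutdown_kernel()
```

Each test would then take `kernel` as an argument and attach the shared manager and client to its widgets.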
I've set up a docker image that can be used to reproduce the issue. To use:
docker run -it astrofrog/debug-qtconsole-226:1 /bin/bash
then, in the container:
xvfb-run python3 test.py
Importantly, at the point where it hangs, the CPU activity goes to zero and the memory is only at 4%. So it seems to be waiting for something.