the-littlest-jupyterhub
JupyterLab Server of user crashes
Bug description
Several times per day, the server of some users crashes. The journalctl output for the affected user's unit (jupyter-username) shows the following error:
Mai 04 08:39:25 jupyterhubvm systemd[1]: Started /bin/bash -c cd /home/jupyter-cyril && exec jupyterhub-singleuser --port=38773 --SingleUserNotebookApp.default_url=/lab.
Mai 04 08:39:26 jupyterhubvm bash[14907]: [I 2022-05-04 08:39:26.512 SingleUserNotebookApp notebookapp:1593] Authentication of /metrics is OFF, since other authentication is disabled.
Mai 04 08:39:27 jupyterhubvm bash[14907]: [I 2022-05-04 08:39:27.378 LabApp] JupyterLab extension loaded from /opt/tljh/user/lib/python3.9/site-packages/jupyterlab
Mai 04 08:39:27 jupyterhubvm bash[14907]: [I 2022-05-04 08:39:27.378 LabApp] JupyterLab application directory is /opt/tljh/user/share/jupyter/lab
Mai 04 08:39:27 jupyterhubvm bash[14907]: /opt/tljh/user/lib/python3.9/site-packages/jupyter_server_mathjax/app.py:40: FutureWarning: The alias `_()` will be deprecated. Use `_i18n()` instead.
Mai 04 08:39:27 jupyterhubvm bash[14907]: help=_("""The MathJax.js configuration file that is to be used."""),
Mai 04 08:39:27 jupyterhubvm bash[14907]: [W 2022-05-04 08:39:27.510 SingleUserNotebookApp notebookapp:2034] Error loading server extension nbresuse
Mai 04 08:39:27 jupyterhubvm bash[14907]: Traceback (most recent call last):
Mai 04 08:39:27 jupyterhubvm bash[14907]: File "/opt/tljh/user/lib/python3.9/site-packages/notebook/notebookapp.py", line 2030, in init_server_extensions
Mai 04 08:39:27 jupyterhubvm bash[14907]: func(self)
Mai 04 08:39:27 jupyterhubvm bash[14907]: File "/opt/tljh/user/lib/python3.9/site-packages/nbresuse/__init__.py", line 49, in load_jupyter_server_extension
Mai 04 08:39:27 jupyterhubvm bash[14907]: PrometheusHandler(PSUtilMetricsLoader(nbapp)), 1000
Mai 04 08:39:27 jupyterhubvm bash[14907]: File "/opt/tljh/user/lib/python3.9/site-packages/nbresuse/prometheus.py", line 25, in __init__
Mai 04 08:39:27 jupyterhubvm bash[14907]: gauge = Gauge(phrase, "counter for " + phrase.replace("_", " "), [])
Mai 04 08:39:27 jupyterhubvm bash[14907]: File "/opt/tljh/user/lib/python3.9/site-packages/prometheus_client/metrics.py", line 355, in __init__
Mai 04 08:39:27 jupyterhubvm bash[14907]: super(Gauge, self).__init__(
Mai 04 08:39:27 jupyterhubvm bash[14907]: File "/opt/tljh/user/lib/python3.9/site-packages/prometheus_client/metrics.py", line 136, in __init__
Mai 04 08:39:27 jupyterhubvm bash[14907]: registry.register(self)
Mai 04 08:39:27 jupyterhubvm bash[14907]: File "/opt/tljh/user/lib/python3.9/site-packages/prometheus_client/registry.py", line 29, in register
Mai 04 08:39:27 jupyterhubvm bash[14907]: raise ValueError(
Mai 04 08:39:27 jupyterhubvm bash[14907]: ValueError: Duplicated timeseries in CollectorRegistry: {'total_memory_usage'}
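The traceback above ends with the registry rejecting a second metric named total_memory_usage. The following is a minimal sketch of that failure mode using a simplified stand-in registry; `TinyRegistry` and `load_extensions` are hypothetical names for illustration and are not prometheus_client or notebook code. The load order of the two extensions is also an assumption. Note that the log marks this as a warning ([W] Error loading server extension nbresuse), so in this sketch the second extension simply fails to load while the rest continues.

```python
class TinyRegistry:
    """Simplified stand-in for a metrics registry (hypothetical, for illustration)."""

    def __init__(self):
        # Maps metric name -> name of the extension that registered it.
        self._names = {}

    def register(self, name, owner):
        # Refuse a second registration under an existing metric name.
        if name in self._names:
            raise ValueError(
                f"Duplicated timeseries in CollectorRegistry: {{{name!r}}}"
            )
        self._names[name] = owner


def load_extensions(registry, extensions):
    """Register one metric per extension; return the extensions that failed."""
    failed = []
    for ext_name, metric in extensions:
        try:
            registry.register(metric, ext_name)
        except ValueError:
            # Mirrors the log: the failure is recorded, loading continues.
            failed.append(ext_name)
    return failed


registry = TinyRegistry()
failed = load_extensions(
    registry,
    [
        # Assumed order: the newer package registers first.
        ("jupyter_resource_usage", "total_memory_usage"),
        ("nbresuse", "total_memory_usage"),
    ],
)
print(failed)
```

Whichever extension registers the name second is the one that fails, which is why removing one of the two packages makes the message disappear.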
Expected behaviour
No crash
Actual behaviour
The server crashes, and a restart of the server and kernel is required. Important: this is independent of RAM usage and happens even after a fresh reboot.
How to reproduce
Hard to reproduce deliberately; the crash occurs on its own after waiting for a while.
Your personal set up
- OS: ubuntu 18.04
- Version(s): latest TLJH version, Proxmox VM on Intel Xeon E5-2620 v4, 256GB RAM
Full environment
asn1crypto==0.24.0
attrs==17.4.0
Automat==0.6.0
bcrypt==3.2.0
blinker==1.4
cached-property==1.5.2
certifi==2018.1.18
cffi==1.15.0
chardet==3.0.4
charset-normalizer==2.0.9
click==6.7
cloud-init==22.1
colorama==0.3.7
command-not-found==0.3
configobj==5.0.6
constantly==15.1.0
cryptography==36.0.0
distro==1.6.0
distro-info===0.18ubuntu0.18.04.1
docker==5.0.3
docker-compose==1.29.2
dockerpty==0.4.1
docopt==0.6.2
httplib2==0.9.2
hyperlink==17.3.1
idna==2.6
incremental==16.10.1
iotop==0.6
Jinja2==2.10
jsonpatch==1.16
jsonpointer==1.10
jsonschema==2.6.0
keyring==10.6.0
keyrings.alt==3.0
language-selector==0.1
MarkupSafe==1.0
netifaces==0.10.4
numpy==1.19.5
oauthlib==2.0.6
PAM==0.4.2
paramiko==2.8.1
pexpect==4.2.1
pyasn1==0.4.2
pyasn1-modules==0.2.1
pycparser==2.21
pycrypto==2.6.1
PyGObject==3.26.1
PyJWT==1.5.3
PyNaCl==1.4.0
pyOpenSSL==17.5.0
pyserial==3.4
python-apt==1.6.5+ubuntu0.7
python-debian==0.1.32
python-dotenv==0.19.2
pyxdg==0.25
PyYAML==3.12
requests==2.26.0
requests-unixsocket==0.1.5
SecretStorage==2.3.1
semantic-version==2.8.5
service-identity==16.0.0
six==1.11.0
sos==4.3
ssh-import-id==5.7
systemd-python==234
texttable==1.6.4
Twisted==17.9.0
typing_extensions==4.0.1
ubuntu-advantage-tools==27.7
ufw==0.36
unattended-upgrades==0.1
urllib3==1.22
websocket-client==0.59.0
zope.interface==4.3.2
Configuration
users:
  admin:
    - agrigor
user_environment:
  default_app: jupyterlab
services:
  configurator:
    enabled: false
auth:
  type: nativeauthenticator.NativeAuthenticator
  NativeAuthenticator:
    open_signup: true
Logs
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:
ping :)
@Agrigor sorry, it's been a busy couple of months! It looks like you have two packages publishing the same metrics: the older nbresuse, and the newer jupyter-resource-usage. I think if you remove nbresuse, you should get what you want.
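To confirm whether both packages are actually present, something like the following sketch could be run with the Python interpreter of the user environment (the log paths suggest it lives under /opt/tljh/user; the exact interpreter path is an assumption). It only uses the standard library; `installed_versions` and `has_conflict` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names of the two packages mentioned above.
CANDIDATES = ("nbresuse", "jupyter-resource-usage")


def installed_versions(names):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment; skip it.
            continue
    return found


def has_conflict(found):
    """True when both resource-usage packages are installed side by side."""
    return all(name in found for name in CANDIDATES)


found = installed_versions(CANDIDATES)
print(found)
if has_conflict(found):
    print("Both packages are installed; consider removing nbresuse.")
```

If only one of the two names shows up, the duplicated-metric warning is unlikely to come from this pair, and the cause of the crashes has to be looked for elsewhere in the logs.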
Hi @minrk, thanks for your answer! I just uninstalled nbresuse, but unfortunately the crashes still occur all the time. Do you have any other ideas on how I can debug or even fix this crash issue? KR
Thanks. After uninstalling nbresuse and restarting JupyterLab, my problem was solved. The error I had was: ValueError: Duplicated timeseries in CollectorRegistry: {'total_memory_usage'}