dorado_basecall_server --version hangs indefinitely and MinKNOW installation fails
Issue Report
Please describe the issue:
During installation of MinKNOW, or when restarting the computer,
/opt/ont/dorado/bin/dorado_basecall_server --version
hangs indefinitely and the minknow service does not start.
Steps to reproduce the issue:
System information:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
During a clean install of minknow-gpu-release, the installation hangs while waiting for dorado --version. For example:
apt install ont-standalone-minknow-gpu-release
Run environment:
- Dorado version:
$ /opt/ont/dorado/bin/dorado_basecall_server --version
hangs indefinitely. Version is 7.3.9
- Dorado command:
$ /opt/ont/dorado/bin/dorado_basecall_server --version
- Operating system: Ubuntu 20.04
- Hardware (CPUs, Memory, GPUs): 20 CPUs, 128 G RAM, RTX5000 GPU
Logs
Selecting previously unselected package ont-doradod-for-minion.
Preparing to unpack .../11-ont-doradod-for-minion_7.3.9-1~focal_all.deb ...
Unpacking ont-doradod-for-minion (7.3.9-1~focal) ...
Selecting previously unselected package ont-kingfisher-ui-minion.
Preparing to unpack .../12-ont-kingfisher-ui-minion_5.9.17-1~focal_all.deb ...
Unpacking ont-kingfisher-ui-minion (5.9.17-1~focal) ...
Selecting previously unselected package ont-run-report.
Preparing to unpack .../13-ont-run-report_5.9.6_amd64.deb ...
Unpacking ont-run-report (5.9.6) ...
Selecting previously unselected package ont-vbz-hdf-plugin.
Preparing to unpack .../14-ont-vbz-hdf-plugin_1.0.8-1~focal_amd64.deb ...
Unpacking ont-vbz-hdf-plugin (1.0.8-1~focal) ...
Selecting previously unselected package ont-standalone-minknow-gpu-release.
Preparing to unpack .../15-ont-standalone-minknow-gpu-release_24.02.10~focal_amd64.deb ...
Unpacking ont-standalone-minknow-gpu-release (24.02.10~focal) ...
Setting up libnorm1:amd64 (1.5.8+dfsg2-2build1) ...
Setting up ont-python (3.10.13-0) ...
Setting up ont-run-report (5.9.6) ...
Setting up ont-dorado-models-for-minion (7.3.9-1) ...
Setting up libdbusmenu-gtk4:amd64 (16.04.1+18.10.20180917-0ubuntu6) ...
Setting up ont-vbz-hdf-plugin (1.0.8-1~focal) ...
Setting up libpgm-5.2-0:amd64 (5.2.122~dfsg-3ubuntu1) ...
Setting up libzmq5:amd64 (4.3.2-2ubuntu1) ...
Setting up minknow-core-minion-nc (5.9.7) ...
<<< Hangs here until timeout received >>>
Job for minknow.service failed because a timeout was exceeded.
See "systemctl status minknow.service" and "journalctl -xe" for details.
Setting up ont-dorado-server-for-minion (7.3.9-1~focal) ...
Setting up libappindicator1 (12.10.1+20.04.20200408.1-0ubuntu1) ...
Setting up ont-doradod-for-minion (7.3.9-1~focal) ...
Created symlink /etc/systemd/system/doradod.service → /lib/systemd/system/doradod.service.
Created symlink /etc/systemd/system/multi-user.target.wants/doradod.service → /lib/systemd/system/doradod.service.
Setting up ont-kingfisher-ui-minion (5.9.17-1~focal) ...
Setting up ont-bream4-minion (7.9.4-1~focal) ...
Setting up ont-configuration-customer-minion (5.9.12-1~focal) ...
Setting up ont-standalone-minknow-gpu-release (24.02.10~focal) ...
Job for minknow.service failed because a timeout was exceeded.
See "systemctl status minknow.service" and "journalctl -xe" for details.
Processing triggers for libc-bin (2.31-0ubuntu9.15) ...
/sbin/ldconfig.real: /opt/ont/dorado/lib/libnvToolsExt.so.1 is not a symbolic link
Processing triggers for desktop-file-utils (0.24-1ubuntu3) ...
Processing triggers for mime-support (3.64ubuntu1) ...
Processing triggers for hicolor-icon-theme (0.17-2) ...
Processing triggers for gnome-menus (3.36.0-1ubuntu1) ...
Output from systemctl status minknow.service
$ systemctl status minknow.service
● minknow.service - MinKNOW Instrument Software for MinION (daemon)
Loaded: loaded (/lib/systemd/system/minknow.service; enabled; vendor preset: enabled)
Active: failed (Result: timeout) since Tue 2024-04-23 16:05:51 ACST; 14min ago
Process: 4285 ExecStartPre=/bin/sleep 15 (code=exited, status=0/SUCCESS)
Process: 4289 ExecStart=/opt/ont/minknow/bin/mk_manager_svc (code=killed, signal=TERM)
Main PID: 4289 (code=killed, signal=TERM)
Tasks: 0 (limit: 153937)
Memory: 4.0K
CGroup: /system.slice/minknow.service
Apr 23 16:04:06 7pzlxr3-l2 systemd[1]: Starting MinKNOW Instrument Software for MinION (daemon)...
Apr 23 16:05:51 7pzlxr3-l2 systemd[1]: minknow.service: start operation timed out. Terminating.
Apr 23 16:05:51 7pzlxr3-l2 systemd[1]: minknow.service: Killing process 4292 (dorado_basecall) with signal SIGKILL.
Apr 23 16:05:51 7pzlxr3-l2 systemd[1]: minknow.service: Failed with result 'timeout'.
Apr 23 16:05:51 7pzlxr3-l2 systemd[1]: Failed to start MinKNOW Instrument Software for MinION (daemon).
Please note also Issue #390 is not resolved, per the lines in bold
Hi @linsalrob ,
Thanks for the bug report - this is an odd one that we've not seen before. Can I ask you to try a couple of diagnostic things so we can try to narrow down what's going on:
- If you run systemctl status dorado, what does it report?
- Is there anything in the /var/log/dorado folder?
- What does nvidia-smi report?
- If you download a standalone archive of dorado basecall server 7.3.9 from https://community.nanoporetech.com/downloads and try to run that with --version, does it also hang? (It can take up to a minute for the server to start the first time you run it, as the executable is very large, but it obviously shouldn't hang indefinitely.)
- If that does hang, can you try installing the standalone dorado command line tools from https://github.com/nanoporetech/dorado and see if you can run dorado --version, and let me know what it does.
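To make the "does it hang?" checks above easier to report, here is a minimal sketch of a wrapper that runs a command under the coreutils `timeout` utility and prints a one-word result. The function name `probe`, the 90-second limit in the usage comments, and the 5-second kill grace period are illustrative assumptions, not part of MinKNOW or dorado.

```shell
# probe SECONDS CMD [ARGS...]
# Hypothetical helper: runs CMD under `timeout` and prints one of
#   ok        - CMD exited with status 0 within the limit
#   hang      - CMD was still running at the limit and was terminated
#   exit=<n>  - CMD finished on its own with a non-zero status <n>
# CMD's own stdout is redirected to stderr so that only the summary
# word appears on stdout.
probe() {
    probe_limit="$1"
    shift
    probe_rc=0
    # -k 5: if CMD ignores SIGTERM, send SIGKILL five seconds later.
    timeout -k 5 "$probe_limit" "$@" >&2 || probe_rc=$?
    case "$probe_rc" in
        0)       echo "ok" ;;
        # 124 = timed out; 137 = 128+9, i.e. SIGKILL was needed.
        # Note: 137 can also mean something else killed CMD.
        124|137) echo "hang" ;;
        *)       echo "exit=$probe_rc" ;;
    esac
}

# Example usage (paths taken from the report; 90 s is an assumed limit
# that allows for the slow first start mentioned above):
#   probe 90 /opt/ont/dorado/bin/dorado_basecall_server --version
#   probe 90 dorado --version
#   probe 30 nvidia-smi
```

A result of `hang` for the packaged server but `ok` for the standalone archive would point at the installed package or its environment rather than the GPU driver.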
Thanks, Mark
Closing as stale - please re-open if needed