Deploying on Ubuntu: the heygem-f2f container fails with RuntimeError: Found no NVIDIA driver on your system

Open gavid0124 opened this issue 9 months ago • 6 comments

The full error output is as follows:

[2025-03-24 11:46:44] [app_local.py[line:230]] [WARNING] [ -> 服务不进行注册]
[2025-03-24 11:46:44] [app_local.py[line:231]] [INFO] [TransDhTask init]
Traceback (most recent call last):
  File "/code/app_local.py", line 231, in <module>
    TransDhTask.instance()
  File "trans_dh_service.py", line 1207, in trans_dh_service.TransDhTask.instance
  File "trans_dh_service.py", line 1189, in trans_dh_service.TransDhTask.__init__
  File "compute_ctc_att_bnf.py", line 130, in compute_ctc_att_bnf.load_ppg_model
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1152, in to
    return self._apply(convert)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/cuda/__init__.py", line 302, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

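I tried the fix from https://github.com/GuijiAI/HeyGem.ai/issues/158, but it did not work.

I also tried pulling the images again, which did not help either. Command: docker-compose pull && docker-compose up -d

The system is Ubuntu 22.04.5, and the machine has eight 4090 GPUs.

gavid0124, Mar 24 '25 03:03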

Did you manage to get it installed on Ubuntu?

qiumiao1988, Mar 24 '25 15:03

> Did you manage to get it installed on Ubuntu?

No. The heygem-f2f container does not work; it fails to start.

gavid0124, Mar 25 '25 01:03

  1. Execute the nvidia-smi command to confirm whether the graphics card driver is installed.
  2. Install the NVIDIA Container Toolkit. The NVIDIA Container Toolkit is a necessary tool for Docker to use NVIDIA GPUs.
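The two checks above can be scripted so that the results are easy to paste into the issue. The following is a minimal sketch, not part of the project: the helper names `run_check` and `report` are made up for illustration, and it assumes the Container Toolkit's `nvidia-ctk` CLI and the `docker` CLI are the tools to probe. It only runs read-only commands.

```python
import shutil
import subprocess

# Read-only commands; none of them changes the system.
CHECKS = {
    "NVIDIA driver on the host": ["nvidia-smi"],
    "NVIDIA Container Toolkit CLI": ["nvidia-ctk", "--version"],
    "Runtimes registered with Docker": ["docker", "info", "--format", "{{json .Runtimes}}"],
}


def run_check(cmd, timeout=20):
    """Run cmd; return (ok, text). ok is False if the tool is missing, fails, or times out."""
    if shutil.which(cmd[0]) is None:
        return False, f"{cmd[0]} not found on PATH"
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    text = (proc.stdout or proc.stderr).strip()
    return proc.returncode == 0, text


def report():
    """Print one OK/FAIL line per check, with the first line of output as a hint."""
    for label, cmd in CHECKS.items():
        ok, text = run_check(cmd)
        first_line = text.splitlines()[0] if text else ""
        print(f"[{'OK' if ok else 'FAIL'}] {label}: {first_line}")


if __name__ == "__main__":
    report()
```

If all three lines report OK but the container still fails, the remaining question is whether the GPU is actually passed into the container. NVIDIA's install guide suggests a sample workload such as docker run --rm --gpus all ubuntu nvidia-smi for that; an `nvidia` entry in the Docker runtimes output is a common sign that the toolkit has been configured for Docker, though that depends on how the toolkit was set up.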

whl88, Mar 25 '25 06:03

>   1. Execute the nvidia-smi command to confirm whether the graphics card driver is installed.
>   2. Install the NVIDIA Container Toolkit. The NVIDIA Container Toolkit is a necessary tool for Docker to use NVIDIA GPUs.

All of these are definitely in place; they are the most basic prerequisites.

gavid0124, Mar 25 '25 12:03

I have the same problem. nvidia-smi works fine, and the other two are running normally.

blessing-gao, Mar 28 '25 08:03

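Do not use Docker Desktop (the Windows version). nvidia-smi appears to succeed there, but in torch, torch.cuda.is_available() still returns False. Instead, install Docker directly inside WSL Ubuntu (search Baidu for the specifics), and also install the CUDA Toolkit and nvidia-container-toolkit. After that, pull the f2f image again and the GPU will be detected correctly.

yuanli-wan, Apr 01 '25 08:04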
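Since a working nvidia-smi does not guarantee that torch can see the GPU, it may help to ask torch directly from inside the environment where the error occurs. This is a minimal sketch under that assumption; the function name `cuda_report` is hypothetical, and it uses only standard torch.cuda calls.

```python
import importlib.util


def cuda_report():
    """Collect what PyTorch reports about CUDA in the current environment."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version torch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": available,
        "device_count": count,
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False while nvidia-smi succeeds in the same environment, the driver is likely visible to the command-line tool but not to the CUDA runtime, which matches the behavior described for Docker Desktop above.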