[Bug] ImportError: DLL load failed while importing _turbomind: The specified module could not be found.
Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
Describe the bug
Installed directly with pip; because of the triton dependency, it ended up auto-installing lmdeploy 0.2.2.
Reproduction
PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> lmdeploy serve api_server internlm/internlm-xcomposer2-vl-1_8b --server-port 23333
Traceback (most recent call last):
File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "G:\Python-3.9.12\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "G:\Python-3.9.12\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
sys.exit(run())
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\entrypoint.py", line 18, in run
args.run(args)
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\cli\serve.py", line 248, in api_server
run_api_server(args.model_path,
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\openai\api_server.py", line 994, in serve
VariableInterface.async_engine = AsyncEngine(
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 67, in __init__
self._build_turbomind(model_path=model_path,
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\serve\async_engine.py", line 107, in _build_turbomind
from lmdeploy import turbomind as tm
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py", line 24, in <module>
from .turbomind import TurboMind # noqa: E402
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\turbomind.py", line 26, in <module>
from .deploy.converter import (get_model_format, supported_formats,
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\converter.py", line 16, in <module>
from .target_model.base import OUTPUT_MODELS, TurbomindModelConfig
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\target_model\__init__.py", line 3, in <module>
from .w4 import TurbomindW4Model # noqa: F401
File "G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\deploy\target_model\w4.py", line 17, in <module>
import _turbomind as _tm # noqa: E402
ImportError: DLL load failed while importing _turbomind: The specified module could not be found.
Environment
win11
PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> lmdeploy check_env
sys.platform: win32
Python: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: n/a
PyTorch: 2.3.0+cpu
PyTorch compiling details: PyTorch built with:
- C++ Version: 201703
- MSVC 192930151
- Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
- OpenMP 2019
- LAPACK is enabled (usually provided by MKL)
- CPU capability usage: AVX512
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.0, USE_CUDA=0, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,
TorchVision: 0.18.0+cpu
LMDeploy: 0.2.2+
transformers: 4.40.1
gradio: Not Found
fastapi: 0.110.2
pydantic: 2.7.1
Environment variables:
PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> echo $env:PATH
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\libnvvp;G:\Python-3.9.12\Scripts;G:\Python-3.9.12;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0;C:\WINDOWS\System32\OpenSSH;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.3.0;C:\Users\marti\AppData\Local\Microsoft\WindowsApps;G:\Microsoft VS Code\bin
nvidia-smi
PS G:\Python-3.9.12\Lib\site-packages\lmdeploy> nvidia-smi
Tue Apr 30 12:32:00 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 522.06 Driver Version: 522.06 CUDA Version: 11.8 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... WDDM | 00000000:01:00.0 Off | N/A |
| N/A 57C P0 30W / N/A | 6MiB / 8192MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
Error traceback
No response
Reinstalled CUDA 12.4 and reinstalled PyTorch, but it still doesn't work.
PS C:\Users\marti> lmdeploy check_env
sys.platform: win32
Python: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 3070 Laptop GPU
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4
NVCC: Cuda compilation tools, release 12.4, V12.4.131
GCC: n/a
PyTorch: 2.3.0+cu121
PyTorch compiling details: PyTorch built with:
- C++ Version: 201703
- MSVC 192930151
- Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
- OpenMP 2019
- LAPACK is enabled (usually provided by MKL)
- CPU capability usage: AVX512
- CUDA Runtime 12.1
- NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
- CuDNN 8.8.1 (built against CUDA 12.0)
- Magma 2.5.4
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.8.1, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,
TorchVision: 0.18.0+cpu
LMDeploy: 0.2.2+
transformers: 4.40.1
gradio: Not Found
fastapi: 0.110.2
pydantic: 2.7.1
The error is still: ImportError: DLL load failed while importing _turbomind: The specified module could not be found.
@irexyc @lvhan028 By the way, when will lmdeploy get a chat group of its own, like xtuner has (instead of using internlm2's)...
@vansin Could you help set up a dedicated lmdeploy user group?
Install the latest LMDeploy; don't use 0.2.2.
Your GPU driver only supports CUDA up to 11.8 (the "CUDA Version: 11.8" field in your nvidia-smi header above is the highest runtime the driver supports), so you can't install the package from PyPI, which is built with CUDA 12, unless you upgrade the driver.
Create a new virtual environment:
pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl --extra-index-url https://download.pytorch.org/whl/cu118
Then see whether it runs. If it doesn't, a message should be printed at startup; please share it.
C:\Users\marti>pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl --extra-index-url https://download.pytorch.org/whl/cu118
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu118
ERROR: lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl is not a supported wheel on this platform.
It errored out.
Check your Python version and pick the matching wheel. I'd suggest using conda.
That URL is the Python 3.9 build: https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp39-cp39-win_amd64.whl
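If you're not sure which wheel tag your interpreter accepts, pip can list its compatible tags (a generic pip check, nothing lmdeploy-specific); the cpXY part of the wheel filename must appear in the output:
pip debug --verbose
For Python 3.11, for instance, that means picking the cp311 wheel.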
Thanks a lot!
C:\Users\marti>pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp311-cp311-win_amd64.whl
Defaulting to user installation because normal site-packages is not writeable
Collecting lmdeploy==0.4.0+cu118
  Downloading https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp311-cp311-win_amd64.whl (52.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 52.7/52.7 MB 19.9 MB/s eta 0:00:00
Collecting einops (from lmdeploy==0.4.0+cu118)
  Using cached einops-0.8.0-py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: fastapi in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.110.3)
Requirement already satisfied: fire in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.6.0)
Requirement already satisfied: mmengine-lite in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.10.4)
Requirement already satisfied: numpy in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (1.26.4)
Collecting peft<=0.9.0 (from lmdeploy==0.4.0+cu118)
  Using cached peft-0.9.0-py3-none-any.whl.metadata (13 kB)
Collecting pillow (from lmdeploy==0.4.0+cu118)
  Using cached pillow-10.3.0-cp311-cp311-win_amd64.whl.metadata (9.4 kB)
Requirement already satisfied: protobuf in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (4.25.3)
Requirement already satisfied: pydantic>2.0.0 in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (2.7.1)
Requirement already satisfied: pynvml in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (11.5.0)
Requirement already satisfied: safetensors in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.4.3)
Requirement already satisfied: sentencepiece in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.2.0)
Requirement already satisfied: shortuuid in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (1.0.13)
Requirement already satisfied: tiktoken in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (0.6.0)
Collecting torch<=2.2.2,>=2.0.0 (from lmdeploy==0.4.0+cu118)
  Using cached torch-2.2.2-cp311-cp311-win_amd64.whl.metadata (26 kB)
Requirement already satisfied: transformers in c:\users\marti\appdata\roaming\python\python311\site-packages (from lmdeploy==0.4.0+cu118) (4.40.1)
INFO: pip is looking at multiple versions of lmdeploy to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement triton<=2.2.0,>=2.1.0 (from lmdeploy) (from versions: none)
ERROR: No matching distribution found for triton<=2.2.0,>=2.1.0
Why is it still failing?
triton doesn't support Windows. You can install it this way:
# step 1, install lmdeploy without deps
pip install https://github.com/InternLM/lmdeploy/releases/download/v0.4.0/lmdeploy-0.4.0+cu118-cp311-cp311-win_amd64.whl --no-deps
# step 2, install the deps (excluding triton)
# create a file named requirements.txt
# with the following contents
einops
fastapi
fire
mmengine-lite
numpy
peft<=0.9.0
pillow
protobuf
pydantic>2.0.0
pynvml
safetensors
sentencepiece
shortuuid
tiktoken
torch<=2.2.2,>=2.0.0
transformers
# triton>=2.1.0,<=2.2.0
uvicorn
pip install -r .\requirements.txt --extra-index-url https://download.pytorch.org/whl/cu118
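After that, a quick sanity check (my suggestion, not an official step) is to trigger the exact import chain that failed before, and to confirm the installed version:
python -c "from lmdeploy import turbomind; print('turbomind import OK')"
pip show lmdeploy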
It finally worked. You're amazing!
0.4.0 installed successfully, but why does running lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333 still return this?
C:\Users\marti>lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333
Traceback (most recent call last):
File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "G:\Python-3.9.12\lib\runpy.py", line 87, in run_code
exec(code, run_globals)
File "G:\Python-3.9.12\Scripts\lmdeploy.exe_main.py", line 7, in
There's a small bug in the code; the message that should have been printed wasn't.
Edit this file: G:\Python-3.9.12\lib\site-packages\lmdeploy\turbomind\__init__.py
# change this line
if os.path.exists(os.path.join(pwd, 'lib')):
# to
if os.path.exists(os.path.join(pwd, '..', 'lib')):
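For context: on Windows with Python 3.8+, an extension module like _turbomind no longer resolves its dependent DLLs through PATH; each directory has to be registered explicitly. A minimal sketch of what that bootstrap step has to do (an illustration with a hypothetical helper name, not lmdeploy's exact code):

import os

def setup_dll_dirs(pwd):
    # Hypothetical sketch: register the directories that hold the DLLs
    # _turbomind depends on, before the extension module is imported.
    if os.name != 'nt':
        return
    lib_dir = os.path.join(pwd, '..', 'lib')  # bundled turbomind libraries
    if os.path.exists(lib_dir):
        os.add_dll_directory(os.path.abspath(lib_dir))
    cuda_path = os.environ.get('CUDA_PATH')   # e.g. C:\...\CUDA\v11.8
    if cuda_path:
        os.add_dll_directory(os.path.join(cuda_path, 'bin'))
    else:
        print('CUDA_PATH is not set; _turbomind may fail to load CUDA DLLs')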
C:\Users\marti>lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333
Traceback (most recent call last):
File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "G:\Python-3.9.12\lib\runpy.py", line 87, in run_code
exec(code, run_globals)
File "G:\Python-3.9.12\Scripts\lmdeploy.exe_main.py", line 7, in
OK, this error is the normal one now.
You're missing the CUDA_PATH environment variable.
You can set it like this (restart PowerShell for it to take effect). Don't copy mine verbatim: from your earlier log you have CUDA 11.8, so you should set it to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
Or you can first set it temporarily by hand in PowerShell and then launch with the command:
$env:CUDA_PATH="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"
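To make the setting persistent across sessions (one option; editing it in the System Properties environment-variables dialog works too):
setx CUDA_PATH "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8"
setx writes to the user-level environment, so it only shows up in shells opened afterwards.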
Thanks, I get it now. I'm going to grab a meal and will try it when I'm back; it should be fine. Thank you so much for the patient help!
That said, your card probably can't run this one: not counting the vision part, a 7B model needs 14 GB just to load, and your card only has 8 GB.
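(For reference, the arithmetic: 7×10⁹ parameters × 2 bytes per parameter in fp16 ≈ 14 GB for the weights alone, before any KV cache.)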
@irexyc @lvhan028 By the way, when will lmdeploy get a chat group of its own, like xtuner has (instead of using internlm2's)...
Already created. There's a QR code at the WeChat link in the README.
You can try this model instead; it should run: deepseek-ai/deepseek-vl-1.3b-chat
# some extra dependencies
pip install git+https://github.com/deepseek-ai/DeepSeek-VL.git --no-deps
pip install torchvision --extra-index-url https://download.pytorch.org/whl/cu118
pip install attrdict timm
lmdeploy serve api_server deepseek-ai/deepseek-vl-1.3b-chat --cache-max-entry-count 0.3
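Once the server is up, you can smoke-test it from Python. A minimal sketch, assuming the default port 23333 and the OpenAI-compatible /v1 endpoints that api_server exposes (requires pip install openai):

# query the running api_server via its OpenAI-compatible API
from openai import OpenAI

client = OpenAI(base_url='http://127.0.0.1:23333/v1', api_key='dummy')
model_id = client.models.list().data[0].id  # the model the server loaded
resp = client.chat.completions.create(
    model=model_id,
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(resp.choices[0].message.content)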
G:\Python-3.9.12>lmdeploy serve api_server liuhaotian/llava-v1.6-vicuna-7b --server-port 23333
Add dll path {dll_path}, please note cuda version should >= 11.3 when compiled with cuda 11
urllib3.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:1129)
The above exception was the direct cause of the following exception:
urllib3.exceptions.ProxyError: ('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "G:\Python-3.9.12\lib\site-packages\requests\adapters.py", line 486, in send
resp = conn.urlopen(
File "G:\Python-3.9.12\lib\site-packages\urllib3\connectionpool.py", line 847, in urlopen
retries = retries.increment(
File "G:\Python-3.9.12\lib\site-packages\urllib3\util\retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /liuhaotian/llava-v1.6-vicuna-7b/resolve/main/config.json (Caused by ProxyError('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)'))))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "G:\Python-3.9.12\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "G:\Python-3.9.12\lib\runpy.py", line 87, in run_code
exec(code, run_globals)
File "G:\Python-3.9.12\Scripts\lmdeploy.exe_main.py", line 7, in
urllib3.exceptions.ProxyError: ('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))
It looks like downloading the model failed.
Try downloading the model manually, then pass its local path to lmdeploy serve api_server.
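For example (one way to do it; this assumes huggingface_hub's CLI, and behind a proxy you may need working proxy settings or a mirror endpoint):
pip install -U "huggingface_hub[cli]"
huggingface-cli download liuhaotian/llava-v1.6-vicuna-7b --local-dir llava-v1.6-vicuna-7b
lmdeploy serve api_server .\llava-v1.6-vicuna-7b --server-port 23333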
Is Windows support really this bad? From installation through inference, quantization, and deployment, the errors haven't stopped. Has anyone actually gotten this working on Windows?
@irexyc Maybe provide a best-practice guide for the Windows platform?
@BUJIDAOVS My Mac is just as bad. Does everyone use Windows plus a GPU? Is a Mac with Apple Silicon not an option? 🐶
LMDeploy is developed on the CUDA platform. It cannot support macOS.
Hello, my Python version is right, so why won't this install?
ERROR: lmdeploy-0.5.3-cu118-cp310-cp310-manylinux2014_x86_64.whl is not a supported wheel on this platform.
Does cu118 only work with CUDA 11.8?
Hello, my Python version is right, so why won't this install?
ERROR: lmdeploy-0.5.3-cu118-cp310-cp310-manylinux2014_x86_64.whl is not a supported wheel on this platform.
The Python environment is 3.10.2.
What operating system is it?
What operating system is it? euleros
What operating system is it? euleros
The 0.5.0 version works fine, though o(╥﹏╥)o
How about trying a build from source? We don't have a EulerOS environment.
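A rough outline (assuming a CUDA toolchain and CMake are available; the authoritative steps are the build-from-source guide in the repository docs):
git clone https://github.com/InternLM/lmdeploy.git
cd lmdeploy
# then follow the repository's build-from-source documentation to compile turbomind and install the package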