[BUG/Help] Error when running a local int4 model on Windows
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
Running web_demo.py with the model path changed to the local int4 model produces the error below. gcc is installed (tdm64-gcc-10.3.0-2.exe), a GPU is present, and CUDA is installed correctly.
gcc: fatal error: cannot read spec file 'libgomp.spec': No such file or directory
Compile parallel cpu kernel gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.c -shared -o C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so failed.
Expected Behavior
No response
Steps To Reproduce
Run web_demo.py with the model path changed to the local int4 model (gcc installed via tdm64-gcc-10.3.0-2.exe, GPU present, CUDA installed correctly). The same error as in Current Behavior is produced:
gcc: fatal error: cannot read spec file 'libgomp.spec': No such file or directory
Compile parallel cpu kernel gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.c -shared -o C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so failed.
Environment
- OS: Windows
- Python: 3.11.4
- Transformers: 4.30.2
- PyTorch: 2.0.1+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True
Anything else?
No response
Compiling the parallel kernel additionally requires OpenMP; if that fails, it falls back to the non-parallel kernel. Check whether any further errors appear after this one.
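As a quick way to confirm whether the local gcc can actually compile OpenMP code (the "cannot read spec file 'libgomp.spec'" error usually means the OpenMP component of the gcc installation is missing or broken), here is a minimal sketch; the helper name and the test program are illustrative, not part of the repo:

```python
# Smoke-test whether `gcc -fopenmp` works on this machine. This is an
# illustrative helper, not code from the ChatGLM repo.
import os
import subprocess
import tempfile
import textwrap

def gcc_supports_openmp(gcc="gcc"):
    src = textwrap.dedent("""\
        #include <omp.h>
        int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
    """)
    with tempfile.TemporaryDirectory() as tmp:
        c_file = os.path.join(tmp, "omp_test.c")
        exe = os.path.join(tmp, "omp_test")
        with open(c_file, "w") as f:
            f.write(src)
        try:
            # Same flag the quantization code uses; links against libgomp.
            result = subprocess.run([gcc, "-fopenmp", c_file, "-o", exe],
                                    capture_output=True)
        except FileNotFoundError:
            return False  # gcc itself is not on PATH
        return result.returncode == 0

print(gcc_supports_openmp())
```

If this prints False, reinstalling TDM-GCC with the OpenMP component selected (or switching to a full MinGW-w64 distribution) may be the fix.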
After installing OpenMP, running again produces a different error:
Load parallel cpu kernel failed C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so:
Traceback (most recent call last):
  File "C:\Users\xuqin/.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization.py", line 125, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "D:\Anaconda3\envs\llm_env\Lib\ctypes\__init__.py", line 454, in LoadLibrary
    return self._dlltype(name)
  File "D:\Anaconda3\envs\llm_env\Lib\ctypes\__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.
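For reference, a hedged sketch of what the traceback's hint ("Try using the full path with constructor syntax") plus Windows DLL-resolution rules suggest: from Python 3.8 on, Windows no longer resolves a library's dependent DLLs (such as libgomp from the MinGW/TDM-GCC installation) via PATH, so the compiler's bin directory may need to be registered explicitly. The `load_kernel` helper and the `mingw_bin` default below are assumptions; adjust the path to the local install.

```python
# Illustrative workaround sketch, not code from the repo: load the compiled
# kernel by absolute path and make its dependent DLLs discoverable first.
import ctypes
import os

def load_kernel(kernel_path, mingw_bin=r"C:\TDM-GCC-64\bin"):
    # On Python 3.8+ for Windows, dependent DLLs are not found via PATH;
    # register the compiler's bin directory so libgomp*.dll can be resolved.
    # (os.add_dll_directory exists only on Windows, Python 3.8+.)
    if hasattr(os, "add_dll_directory") and os.path.isdir(mingw_bin):
        os.add_dll_directory(mingw_bin)
    # Load with an absolute path, as the error message suggests.
    return ctypes.CDLL(os.path.abspath(kernel_path))
```

This would also be consistent with the later report that downgrading to Python 3.7 avoids the problem, since 3.7 still searched PATH for dependent DLLs.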
I'm getting the same error. Did you manage to solve it?
Not solved yet; I don't know whether something else still needs to be configured.
Downgrade Python to 3.7: https://blog.csdn.net/weixin_42398658/article/details/119778719
I'm seeing the same error here, but the model still runs successfully, just very slowly. I suspect quantization isn't actually being applied.
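To tell whether the slowness comes from the model silently running on CPU rather than on the GPU, a small diagnostic sketch can help (assuming `model` is the object returned by `AutoModel.from_pretrained` in web_demo.py; `describe_model` is an illustrative helper, not an API of the repo):

```python
# Illustrative diagnostic, not part of web_demo.py: report where the model's
# weights live and whether CUDA is usable, to distinguish "slow because it
# fell back to CPU" from other causes.
import torch

def describe_model(model):
    p = next(model.parameters())
    return (f"device={p.device}, dtype={p.dtype}, "
            f"cuda_available={torch.cuda.is_available()}")

# Usage with the model from web_demo.py:
#   print(describe_model(model))
# device=cuda:0 would indicate the GPU is actually being used.
```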