[BUG/Help] Error when running a local int4 model on Windows
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
Running web_demo.py with the model path changed to the local int4 model produces the error below. gcc is installed (tdm64-gcc-10.3.0-2.exe), a GPU is present, and CUDA is installed correctly.
gcc: fatal error: cannot read spec file 'libgomp.spec': No such file or directory
Compile parallel cpu kernel gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.c -shared -o C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so failed.
Expected Behavior
No response
Steps To Reproduce
Run web_demo.py with the model path changed to the local int4 model (gcc installed via tdm64-gcc-10.3.0-2.exe, GPU present, CUDA installed correctly). The same error as in Current Behavior is produced:
gcc: fatal error: cannot read spec file 'libgomp.spec': No such file or directory
Compile parallel cpu kernel gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.c -shared -o C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so failed.
Environment
- OS: Windows
- Python: 3.11.4
- Transformers: 4.30.2
- PyTorch: 2.0.1+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True
Anything else?
No response
Compiling the parallel kernel additionally requires OpenMP; if that fails, it falls back to the non-parallel kernel. Check whether any further errors appear after this one.
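As a quick way to confirm whether the local gcc can actually compile OpenMP code (the "cannot read spec file 'libgomp.spec'" error usually means the OpenMP component of the gcc installation is missing or broken), here is a minimal sketch; the helper name and the test program are illustrative, not part of the repo:

```python
# Smoke-test whether `gcc -fopenmp` works on this machine. This is an
# illustrative helper, not code from the ChatGLM repo.
import os
import subprocess
import tempfile
import textwrap

def gcc_supports_openmp(gcc="gcc"):
    src = textwrap.dedent("""\
        #include <omp.h>
        int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
    """)
    with tempfile.TemporaryDirectory() as tmp:
        c_file = os.path.join(tmp, "omp_test.c")
        exe = os.path.join(tmp, "omp_test")
        with open(c_file, "w") as f:
            f.write(src)
        try:
            # Same flag the quantization code uses; links against libgomp.
            result = subprocess.run([gcc, "-fopenmp", c_file, "-o", exe],
                                    capture_output=True)
        except FileNotFoundError:
            return False  # gcc itself is not on PATH
        return result.returncode == 0

print(gcc_supports_openmp())
```

If this prints False, reinstalling TDM-GCC with the OpenMP component selected (or switching to a full MinGW-w64 distribution) may be the fix.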
After installing OpenMP, running again produces a different error:
Load parallel cpu kernel failed C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so:
Traceback (most recent call last):
  File "C:\Users\xuqin/.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization.py", line 125, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "D:\Anaconda3\envs\llm_env\Lib\ctypes\__init__.py", line 454, in LoadLibrary
    return self._dlltype(name)
  File "D:\Anaconda3\envs\llm_env\Lib\ctypes\__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\xuqin\.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\quantization_kernels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.
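For reference, a hedged sketch of what the traceback's hint ("Try using the full path with constructor syntax") plus Windows DLL-resolution rules suggest: from Python 3.8 on, Windows no longer resolves a library's dependent DLLs (such as libgomp from the MinGW/TDM-GCC installation) via PATH, so the compiler's bin directory may need to be registered explicitly. The `load_kernel` helper and the `mingw_bin` default below are assumptions; adjust the path to the local install.

```python
# Illustrative workaround sketch, not code from the repo: load the compiled
# kernel by absolute path and make its dependent DLLs discoverable first.
import ctypes
import os

def load_kernel(kernel_path, mingw_bin=r"C:\TDM-GCC-64\bin"):
    # On Python 3.8+ for Windows, dependent DLLs are not found via PATH;
    # register the compiler's bin directory so libgomp*.dll can be resolved.
    # (os.add_dll_directory exists only on Windows, Python 3.8+.)
    if hasattr(os, "add_dll_directory") and os.path.isdir(mingw_bin):
        os.add_dll_directory(mingw_bin)
    # Load with an absolute path, as the error message suggests.
    return ctypes.CDLL(os.path.abspath(kernel_path))
```

This would also be consistent with the later report that downgrading to Python 3.7 avoids the problem, since 3.7 still searched PATH for dependent DLLs.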
I'm getting the same error. Did you manage to solve it?
Not solved yet; I don't know whether something else still needs to be configured.
Downgrade Python to 3.7: https://blog.csdn.net/weixin_42398658/article/details/119778719
I'm seeing the same error here, but the model still runs successfully, just very slowly. I suspect quantization isn't actually being applied.
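To tell whether the slowness comes from the model silently running on CPU rather than on the GPU, a small diagnostic sketch can help (assuming `model` is the object returned by `AutoModel.from_pretrained` in web_demo.py; `describe_model` is an illustrative helper, not an API of the repo):

```python
# Illustrative diagnostic, not part of web_demo.py: report where the model's
# weights live and whether CUDA is usable, to distinguish "slow because it
# fell back to CPU" from other causes.
import torch

def describe_model(model):
    p = next(model.parameters())
    return (f"device={p.device}, dtype={p.dtype}, "
            f"cuda_available={torch.cuda.is_available()}")

# Usage with the model from web_demo.py:
#   print(describe_model(model))
# device=cuda:0 would indicate the GPU is actually being used.
```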