ChatGLM-6B
[BUG/Help] Error loading the INT-4 model on GPU under Windows
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
No compiled kernel found.
Compiling kernels : C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.so
e:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.so
Load default cpu kernel failed:
Traceback (most recent call last):
  File "C:\Users\admin/.cache\huggingface\modules\transformers_modules\chatglm\quantization.py", line 167, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application.
Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers
No compiled kernel found.
Compiling kernels : C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.so
e:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.so
Load default cpu kernel failed:
Traceback (most recent call last):
  File "C:\Users\admin/.cache\huggingface\modules\transformers_modules\chatglm\quantization.py", line 167, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application.
Failed to load kernel.
Traceback (most recent call last):
File "
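Note on the `OSError: [WinError 193]` in the log: Windows typically raises this when a process tries to load a binary built for a different architecture, for example a 32-bit DLL in a 64-bit Python. One way to check is to read the Machine field from the PE header of the compiled kernel file. The sketch below is illustrative only and is not part of ChatGLM-6B. The function name `pe_machine` and the lookup table are made up for this example. The header offsets follow the PE/COFF layout.

```python
import struct

# Machine values defined for the PE/COFF file header.
PE_MACHINES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(data: bytes) -> str:
    """Return the target architecture recorded in a PE image (DLL or EXE).

    Hypothetical helper for illustration; not part of any library.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ header")
    # Offset 0x3C of the DOS header stores the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    # The 2-byte Machine field immediately follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return PE_MACHINES.get(machine, f"unknown (0x{machine:04X})")


if __name__ == "__main__":
    import sys

    # Usage: python check_pe.py path\to\quantization_kernels.so
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as f:
            print(pe_machine(f.read()))
```

If this reports a 32-bit image while the Python interpreter is 64-bit, the compiler that produced the file is the likely cause.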
Expected Behavior
No response
Steps To Reproduce
Load the INT-4 model on GPU under Windows. The earlier steps go smoothly, but gcc installed via MinGW cannot compile quantization_kernels_parallel.
Environment
- OS:Windows10
- Python:3.10.10
- Transformers:4.27.1
- PyTorch:2.0.0+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True
Anything else?
No response
Are you running 32-bit Windows 10?
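The bitness of the Python interpreter and of the gcc toolchain may matter more here than the OS itself. The `lib/gcc/mingw32/6.3.0` path in the log suggests a 32-bit MinGW toolchain, which would produce a 32-bit kernel file that a 64-bit Python cannot load. A minimal sketch for checking both sides, assuming `gcc` is on `PATH`; the helper names are made up for this example:

```python
import platform
import shutil
import struct
import subprocess


def python_bits() -> int:
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def gcc_target():
    """Return the target triplet printed by `gcc -dumpmachine`.

    Returns None when gcc is not on PATH or cannot be run.
    """
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    try:
        result = subprocess.run(
            [gcc, "-dumpmachine"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print("OS machine :", platform.machine())
    print("Python bits:", python_bits())
    print("gcc target :", gcc_target())
```

A 64-bit Python paired with a gcc target of plain `mingw32` would point to a mismatch. A 64-bit MinGW-w64 toolchain usually reports a target such as `x86_64-w64-mingw32`.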
You could switch to a different deployment method, WSL. See the Windows deployment documentation.
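I ran into the same problem and have been stuck on it for a long time. Did the original poster solve it by installing lpthread? I downloaded lpthread and changed the system environment variables, but it still does not work. I would be very grateful for any pointers.
Environment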
- OS:Windows11
- Python:3.9.0
- Transformers:4.26.1
- PyTorch:2.0.1+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True
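Regarding `cannot find -lpthread`: this is a linker error, meaning the gcc in use cannot find a pthread library to link against. Changing environment variables will not help if the toolchain itself does not ship that library. A rough way to probe which of the flags from the log the installed gcc can link with is sketched below. It assumes `gcc` is on `PATH`, and `can_link` is a made-up helper name:

```python
import os
import shutil
import subprocess
import tempfile

# Trivial program; only the link step matters for this probe.
C_SOURCE = "int main(void) { return 0; }\n"


def can_link(flags):
    """Try to compile and link a trivial C program with the given gcc flags.

    Returns True or False, or None when gcc is not on PATH.
    """
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        out = os.path.join(tmp, "probe.out")
        with open(src, "w") as f:
            f.write(C_SOURCE)
        try:
            result = subprocess.run(
                [gcc, *flags, src, "-o", out],
                capture_output=True,
                text=True,
                timeout=120,
            )
        except (OSError, subprocess.SubprocessError):
            return False
        return result.returncode == 0


if __name__ == "__main__":
    # Flags taken from the compile command in the log.
    for flags in ([], ["-pthread"], ["-fopenmp"]):
        print(flags, "->", can_link(flags))
```

If the plain build succeeds but `-pthread` fails, the toolchain lacks pthread support, and a different gcc distribution or the WSL route mentioned above may be the simpler fix.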