ChatGLM-6B

[BUG/Help] Error when loading the INT4 model on GPU under Windows

Open OathK1per opened this issue 1 year ago • 1 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

No compiled kernel found.
Compiling kernels : C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.so
e:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.so
Load default cpu kernel failed:
Traceback (most recent call last):
  File "C:\Users\admin/.cache\huggingface\modules\transformers_modules\chatglm\quantization.py", line 167, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application.

Failed to load kernel.
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers
No compiled kernel found.
Compiling kernels : C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels_parallel.so
e:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lpthread
collect2.exe: error: ld returned 1 exit status
Compile default cpu kernel failed, using default cpu kernel code.
Compiling gcc -O3 -fPIC -std=c99 C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.c -shared -o C:\Users\admin\.cache\huggingface\modules\transformers_modules\chatglm\quantization_kernels.so
Load default cpu kernel failed:
Traceback (most recent call last):
  File "C:\Users\admin/.cache\huggingface\modules\transformers_modules\chatglm\quantization.py", line 167, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application.

Failed to load kernel.
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Users\admin/.cache\huggingface\modules\transformers_modules\chatglm\modeling_chatglm.py", line 1430, in quantize
    load_cpu_kernel(**kwargs)
  File "C:\Users\admin/.cache\huggingface\modules\transformers_modules\chatglm\quantization.py", line 430, in load_cpu_kernel
    assert cpu_kernels.load
AssertionError
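
One common cause of `[WinError 193]` is an architecture mismatch: a 64-bit Python process cannot load a 32-bit DLL. The linker path `lib/gcc/mingw32/6.3.0` in the log suggests a legacy 32-bit MinGW toolchain, though that is an inference and not something the log states outright. A minimal sketch for checking it, assuming the compiled `.so` is an ordinary PE file; `pe_machine` and `python_bits` are hypothetical helper names and are not part of ChatGLM-6B:

```python
import struct

# "Machine" field values from the PE/COFF header for common Windows targets.
PE_MACHINES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64 (64-bit)",
}


def pe_machine(path):
    """Return a label for the architecture a Windows DLL/EXE was built for."""
    with open(path, "rb") as f:
        dos_header = f.read(64)
        if len(dos_header) < 64 or dos_header[:2] != b"MZ":
            raise ValueError("not a PE file (missing MZ header)")
        # Offset 0x3C stores the file offset of the "PE\0\0" signature.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        f.seek(pe_offset)
        pe_header = f.read(6)
    if len(pe_header) < 6 or pe_header[:4] != b"PE\0\0":
        raise ValueError("not a PE file (missing PE signature)")
    # The 2-byte Machine field follows the signature directly.
    (machine,) = struct.unpack_from("<H", pe_header, 4)
    return PE_MACHINES.get(machine, hex(machine))


def python_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8
```

Calling `pe_machine` on the generated `quantization_kernels.so` and comparing the result with `python_bits()` would show whether the compiled kernel and the interpreter disagree on architecture.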

Expected Behavior

No response

Steps To Reproduce

Loading the INT4 model on GPU under Windows. The earlier steps went smoothly, but with the gcc downloaded via MinGW, quantization_kernels_parallel cannot be compiled.

Environment

- OS:Windows10
- Python:3.10.10
- Transformers:4.27.1
- PyTorch:2.0.0+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True

Anything else?

No response

OathK1per, Apr 18 '23 08:04

Are you running a 32-bit Windows 10 system?

YIZXIY, Apr 18 '23 10:04

You could switch to a different deployment method, WSL; see the Windows deployment documentation.

ZhangErling, Apr 19 '23 09:04

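I ran into the same problem and have been stuck on it for a long time. Did the original poster solve it by using lpthread? I downloaded lpthread and changed the system environment variables, but it still does not work. I would be very grateful if the original poster could offer some pointers.

Environment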

- OS:Windows11
- Python:3.9.0
- Transformers:4.26.1
- PyTorch:2.0.1+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True
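
Before changing linker settings, it may help to confirm which architecture the `gcc` on `PATH` targets. `gcc -dumpmachine` prints the target triple: legacy MinGW reports `mingw32`, while MinGW-w64 builds report triples such as `x86_64-w64-mingw32`. A rough sketch, assuming `gcc` is on `PATH`; `triple_bits` and `gcc_target` are hypothetical helper names, and the architecture lists cover only common cases:

```python
import shutil
import subprocess


def triple_bits(triple):
    """Guess pointer width (32 or 64) from a GCC target triple, or None if unknown."""
    arch = triple.strip().split("-")[0].lower()
    if arch in ("x86_64", "amd64", "aarch64"):
        return 64
    # Legacy MinGW prints the bare triple "mingw32", which is a 32-bit target.
    if arch in ("mingw32", "i386", "i486", "i586", "i686"):
        return 32
    return None


def gcc_target():
    """Return the output of `gcc -dumpmachine`, or None if gcc is not on PATH."""
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    result = subprocess.run([gcc, "-dumpmachine"], capture_output=True, text=True)
    return result.stdout.strip() or None
```

If `triple_bits(gcc_target())` returns 32 while the Python interpreter is 64-bit, any kernel that this compiler builds would likely fail to load regardless of whether the pthread library is found.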

AllenXiao95, Jun 16 '23 18:06