
[BUG/Help] <Help wanted>

Open 572120986 opened this issue 1 year ago • 7 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

When I run the code with `(………trust_remote_code=True).quantize(8).half().cuda()`, the following exception is raised: `AttributeError: 'NoneType' object has no attribute 'int8WeightExtractionHalf'`
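For context, the traceback later in this thread shows the error comes from `quantization.py` dereferencing a `kernels` object. A minimal illustration (not the actual `quantization.py` source) of how that `AttributeError` arises:

```python
# Minimal illustration: when cpm_kernels fails to import, the
# module-level `kernels` handle is left as None, and a later attribute
# access reproduces exactly the reported AttributeError.
kernels = None  # state after "Failed to load cpm_kernels: ..."

try:
    func = kernels.int8WeightExtractionHalf
except AttributeError as e:
    print(e)
```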

Expected Behavior

No response

Steps To Reproduce

When running with int8 quantization, it reports that the corresponding function does not exist.

Environment

- OS: Windows 10
- Python: 3.9.13
- Transformers: 4.27.1
- PyTorch: 1.13.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`):

Anything else?

No response

572120986 avatar Apr 10 '23 08:04 572120986

Does your environment have a GPU? Also, please provide the complete runtime log.

duzx16 avatar Apr 10 '23 09:04 duzx16

Yes, there is a GPU.

```
D:\python3.9.13\python.exe D:/pythonWork/ChatGLM-6B-main/cli_demo.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████| 8/8 [00:12<00:00, 1.56s/it]
Failed to load cpm_kernels:[WinError 267] The directory name is invalid.: 'C:\Windows\System32\zlibwapi.dll'
Welcome to the ChatGLM-6B model. Type your message to chat, "clear" to clear the history, "stop" to exit.

User: Hello
Traceback (most recent call last):
  File "D:\pythonWork\ChatGLM-6B-main\cli_demo.py", line 57, in <module>
    main()
  File "D:\pythonWork\ChatGLM-6B-main\cli_demo.py", line 42, in main
    for response, history in model.stream_chat(tokenizer, query, history=history):
  File "D:\python3.9.13\lib\site-packages\torch\autograd\grad_mode.py", line 43, in generator_context
    response = gen.send(None)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 1279, in stream_chat
    for outputs in self.stream_generate(**inputs, **gen_kwargs):
  File "D:\python3.9.13\lib\site-packages\torch\autograd\grad_mode.py", line 43, in generator_context
    response = gen.send(None)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 1356, in stream_generate
    outputs = self(
  File "D:\python3.9.13\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 1158, in forward
    transformer_outputs = self.transformer(
  File "D:\python3.9.13\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 971, in forward
    layer_ret = layer(
  File "D:\python3.9.13\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 612, in forward
    attention_outputs = self.attention(
  File "D:\python3.9.13\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 437, in forward
    mixed_raw_layer = self.query_key_value(hidden_states)
  File "D:\python3.9.13\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\quantization.py", line 147, in forward
    output = W8A16Linear.apply(input, self.weight, self.weight_scale, self.weight_bit_width)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\quantization.py", line 51, in forward
    weight = extract_weight_to_half(quant_w, scale_w, weight_bit_width)
  File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\chatglm-6b\quantization.py", line 90, in extract_weight_to_half
    func = kernels.int8WeightExtractionHalf
AttributeError: 'NoneType' object has no attribute 'int8WeightExtractionHalf'

Process finished with exit code 1
```

572120986 avatar Apr 10 '23 09:04 572120986

> Yes, there is a GPU. [quoted log identical to the comment above]

According to the log, the problem should be this error:

[WinError 267] The directory name is invalid.: 'C:\Windows\System32\zlibwapi.dll'
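One way to confirm this diagnosis is to scan `PATH` for entries that point at a file rather than a directory: `os.listdir()` on such an entry raises WinError 267 on Windows, which is what makes cpm_kernels give up. A minimal check (hypothetical helper name, standard library only):

```python
import os

def find_bad_path_entries(path_env):
    """Return PATH entries that point at a file rather than a directory.

    On Windows, os.listdir() on such an entry raises
    [WinError 267] The directory name is invalid, which is what makes
    cpm_kernels give up and leave its kernel handle as None.
    """
    return [entry for entry in path_env.split(os.pathsep)
            if entry and os.path.isfile(entry)]

for entry in find_bad_path_entries(os.environ.get("PATH", "")):
    print("PATH entry is a file, not a directory:", entry)
```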

duzx16 avatar Apr 10 '23 10:04 duzx16

But the file zlibwapi.dll does exist at that path. I am on torch 1.13.1 with CUDA 11.6.

572120986 avatar Apr 10 '23 11:04 572120986

> But the file zlibwapi.dll does exist at that path. I am on torch 1.13.1 with CUDA 11.6.

```python
def lookup_dll(prefix):
    paths = os.environ.get("PATH", "").split(os.pathsep)
    for path in paths:
        if not os.path.exists(path) or ("zlibwapi" in path):
            continue
        for name in os.listdir(path):
            if name.startswith(prefix) and name.lower().endswith(".dll"):
                return os.path.join(path, name)
    return None
```

Just add a filter check yourself and it works.
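As a self-contained variant of that workaround (a sketch, not the upstream cpm_kernels source: it swaps the `"zlibwapi" in path` substring check for a more general `os.path.isdir` filter that skips any `PATH` entry that is not a real directory):

```python
import os

def lookup_dll(prefix):
    """Search PATH for the first DLL whose name starts with `prefix`.

    Skips PATH entries that are missing or are not directories; a PATH
    entry pointing at a file (such as zlibwapi.dll itself) would
    otherwise make os.listdir() raise WinError 267 on Windows.
    """
    for path in os.environ.get("PATH", "").split(os.pathsep):
        if not os.path.isdir(path):
            continue
        for name in os.listdir(path):
            if name.startswith(prefix) and name.lower().endswith(".dll"):
                return os.path.join(path, name)
    return None
```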

Jason916 avatar Apr 16 '23 11:04 Jason916

Has this been solved? How did you solve it? I get the same error when trying int8.

jinzijian avatar May 29 '23 11:05 jinzijian

> Has this been solved? How did you solve it? I get the same error when trying int8.

It has been solved. Just see my earlier reply above: add a filter check, `if not os.path.exists(path) or ("zlibwapi" in path): continue`

Jason916 avatar May 30 '23 16:05 Jason916