ChatGLM-6B

Using the int4 quantized model produces the following error: AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionFloat'

fryng opened this issue

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).float()

Using the int4 quantized model produces the following error: AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionFloat'

Expected Behavior

No response

Steps To Reproduce

Environment

- OS: Windows 11
- Python: 3.8
- Transformers: -
- PyTorch: 2.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): False

Anything else?

No response

fryng · Mar 24 '23 05:03

Check whether your Python is 64-bit; gcc may be compiling a 32-bit .so.
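A quick way to check the Python side (a minimal sketch; running gcc -dumpmachine in a shell shows the compiler's target triple):

    # Both values should indicate 64-bit on a correctly matched setup.
    import platform, struct
    print(platform.architecture()[0])   # e.g. '64bit'
    print(struct.calcsize("P") * 8)     # pointer width in bits: 64 on a 64-bit Python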

YufengSoft · Mar 24 '23 10:03

Ran into the same problem, and everything here is 64-bit.

dinfer · Mar 31 '23 08:03

Did the CPU kernel fail to load? Could you provide the complete output?
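For reference, quantization.py falls back to a None kernels object when cpm_kernels fails to import or compile, which is exactly what produces the AttributeError above. A minimal sanity check, assuming nothing beyond the package itself:

    # If this prints a failure, the int4 kernels can never load.
    try:
        import cpm_kernels
        print("cpm_kernels imported OK")
    except Exception as e:
        print("cpm_kernels failed to import:", e)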

songxxzp · Apr 03 '23 06:04

    output = W8A16LinearCPU.apply(input, self.weight, self.weight_scale, self.weight_bit_width, self.quantization_cache)
  File "C:\Users\cm/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\quantization.py", line 76, in forward
    weight = extract_weight_to_float(quant_w, scale_w, weight_bit_width, quantization_cache=quantization_cache)
  File "C:\Users\cm/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\quantization.py", line 260, in extract_weight_to_float
    func = cpu_kernels.int4WeightExtractionFloat
AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionFloat'

hellonlp · Apr 14 '23 02:04

Different environment, but the same error. Environment details:

Environment

  • OS: macOS 13.0
  • Python: 3.8
  • Transformers: -
  • MPS Support (`python -c "import torch; print(torch.backends.mps.is_available())"`): True

Running on CPU with the int4 quantized model works fine:

    model = AutoModel.from_pretrained("local path", trust_remote_code=True).float()

Running on MPS with the int4 quantized model:

    model = AutoModel.from_pretrained("local path", trust_remote_code=True).half().to('mps')

fails with:

Traceback (most recent call last):
  File "/PycharmProjects/ChatGLM-6B/cli_demo.py", line 58, in <module>
    main()
  File "/PycharmProjects/ChatGLM-6B/cli_demo.py", line 43, in main
    for response, history in model.stream_chat(tokenizer, query, history=history):
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/modeling_chatglm.py", line 1312, in stream_chat
    for outputs in self.stream_generate(**inputs, **gen_kwargs):
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/modeling_chatglm.py", line 1389, in stream_generate
    outputs = self(
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/modeling_chatglm.py", line 1191, in forward
    transformer_outputs = self.transformer(
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/modeling_chatglm.py", line 997, in forward
    layer_ret = layer(
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/modeling_chatglm.py", line 627, in forward
    attention_outputs = self.attention(
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/modeling_chatglm.py", line 445, in forward
    mixed_raw_layer = self.query_key_value(hidden_states)
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization.py", line 375, in forward
    output = W8A16Linear.apply(input, self.weight, self.weight_scale, self.weight_bit_width)
  File "/miniconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization.py", line 53, in forward
    weight = extract_weight_to_half(quant_w, scale_w, weight_bit_width)
  File "/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization.py", line 262, in extract_weight_to_half
    func = kernels.int4WeightExtractionHalf
AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionHalf'

ioiogoo · Apr 20 '23 09:04

model = AutoModel.from_pretrained("localpath", trust_remote_code=True).float().to('mps')

    func = kernels.int4WeightExtractionHalf
AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionHalf'

lee528066 · Apr 29 '23 13:04

pip install cpm_kernels
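
To confirm the package is visible from the same environment the model runs in, a one-line sketch:

    # Should print a path inside the active environment's site-packages.
    import cpm_kernels
    print(cpm_kernels.__file__)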

dereksjtu · May 07 '23 03:05

pip install cpm_kernels

That worked. By the way, what's the meaning of cpm?

JungleG · May 30 '23 06:05

pip install cpm_kernels
Requirement already satisfied: cpm_kernels in d:\anaconda3\envs\chatglm\lib\site-packages (1.0.11)

duyanke888 · Jun 07 '23 18:06

That doesn't work; I still get the same error.

IamHimon · Jun 28 '23 07:06

Hi, did you manage to solve it?

IamHimon · Jun 28 '23 08:06

The quantized model probably only supports CUDA; at least that was how the first version of ChatGLM behaved.
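
If so, a defensive loading pattern (an illustrative sketch, not the project's official recipe) is to fall back to full float on CPU whenever CUDA is unavailable:

    import torch
    from transformers import AutoModel

    # Half precision on CUDA, float32 on CPU: the int4 kernels
    # may not load at all on other backends such as MPS.
    if torch.cuda.is_available():
        model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).half().cuda()
    else:
        model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).float()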

yichengming · Jul 05 '23 07:07

Hi, I ran into the same problem. Has it been solved?

littleyanglovegithub · Aug 10 '23 00:08

Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Failed to load cpm_kernels: [WinError 267] The directory name is invalid: 'D:\\software\\Graphviz\\bin\\dot.exe'
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile parallel cpu kernel gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\cc\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.c -shared -o C:\Users\cc\.cache\huggingface\modules\transformers_modules\local\quantization_kernels_parallel.so failed.
'gcc' is not recognized as an internal or external command, operable program or batch file.
Compile cpu kernel gcc -O3 -fPIC -std=c99 C:\Users\cc\.cache\huggingface\modules\transformers_modules\local\quantization_kernels.c -shared -o C:\Users\cc\.cache\huggingface\modules\transformers_modules\local\quantization_kernels.so failed.
Traceback (most recent call last):
  File "E:\data\Projects\chatglm2-6b\main.py", line 11, in <module>
    response, history = model.chat(tokenizer, "你好", history=[])
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1028, in chat
    outputs = self.generate(**inputs, **gen_kwargs)
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\transformers\generation\utils.py", line 1437, in generate
    return self.sample(
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\transformers\generation\utils.py", line 2443, in sample
    outputs = self(
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 932, in forward
    transformer_outputs = self.transformer(
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 828, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 638, in forward
    layer_ret = layer(
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 542, in forward
    attention_output, kv_cache = self.self_attention(
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 374, in forward
    mixed_x_layer = self.query_key_value(hidden_states)
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\quantization.py", line 502, in forward
    output = W8A16Linear.apply(input, self.weight, self.weight_scale, self.weight_bit_width)
  File "D:\software\anaconda3\envs\chatglm2-6b\lib\site-packages\torch\autograd\function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\quantization.py", line 75, in forward
    weight = extract_weight_to_half(quant_w, scale_w, weight_bit_width)
  File "C:\Users\cc/.cache\huggingface\modules\transformers_modules\local\quantization.py", line 287, in extract_weight_to_half
    func = kernels.int4WeightExtractionHalf
AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionHalf'

I'm hitting this too. I already ran pip install cpm_kernels, and the file 'D:\software\Graphviz\bin\dot.exe' does exist at that path. How did you solve it?
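
For what it's worth, the "'gcc' is not recognized" lines in that log mean the CPU-kernel compile step cannot find gcc at all; a quick check from Python (sketch):

    # shutil.which mirrors the shell's PATH lookup; None here
    # matches the "'gcc' is not recognized" messages above.
    import shutil
    print(shutil.which("gcc"))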

fskz · Sep 22 '23 01:09

Same problem here, using chatglm2-6b int4: https://github.com/chatchat-space/Langchain-Chatchat/issues/1995

BIM4SmartHydropower · Nov 08 '23 13:11

I solved this problem (using chatglm-6b-int4). The root cause: a function in cpm_kernels walks every entry of the PATH environment variable and treats each one as a directory, so the os.listdir call inside it raises when an entry is actually a file; the exception is never handled, propagates all the way up, and the kernel load fails.

Solution: change the lookup_dll function in cpm_kernels/library/base.py to the following:

    def lookup_dll(prefix):
        paths = os.environ.get("PATH", "").split(os.pathsep)
        for path in paths:
            # PATH entries are not guaranteed to exist or to be directories;
            # skip missing ones and tolerate listdir errors instead of crashing.
            if not os.path.exists(path):
                continue
            try:
                for name in os.listdir(path):
                    if name.startswith(prefix) and name.lower().endswith(".dll"):
                        return os.path.join(path, name)
            except Exception as e:
                print(e)
        return None
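
The underlying failure is easy to reproduce in isolation (using the path from the log above; any file sitting on PATH triggers it):

    import os
    # PATH entries are supposed to be directories, but nothing enforces that.
    # On a machine where this file exists, listdir raises NotADirectoryError,
    # reported on Windows as [WinError 267] "The directory name is invalid".
    os.listdir(r"D:\software\Graphviz\bin\dot.exe")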

usamimeri · Nov 18 '23 14:11

Replying to usamimeri's fix above:

I modified that file, but it still didn't solve the problem.

340090738 · Mar 24 '24 17:03