[BUG/Help] TypeError raised when model.chat() is called
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
To run chatglm-6b-int4 on my local machine, I use the following example code:

```python
from transformers import AutoTokenizer, AutoModel
import torch

modelname = "D:\\models\\zhipu\\chatglm-6b-int4"
tokenizer = AutoTokenizer.from_pretrained(modelname, trust_remote_code=True)
model = AutoModel.from_pretrained(modelname, trust_remote_code=True).float()
model = model.quantize(bits=4, kernel_file="D:\\models\\zhipu\\chatglm-6b-int4\\quantization_kernels_parallel.so")

message = '你好'  # "Hello"
response, history = model.chat(tokenizer, message, history=[])
print(response)

# "What should I do if I can't sleep at night?"
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
print(response)
```
However, when I run this code I get `TypeError: expected Tensor as element 0 in argument 0, but got tuple`. Please see the full error message in the attached file [Error Message.txt](url).
I traced the code and found the error is raised at line 254 in modeling_chatglm.py:

```python
if layer_past is not None:
    past_key, past_value = layer_past[0], layer_past[1]
    if (type(layer_past) != str):
        key_layer = torch.cat((past_key, key_layer), dim=0)
        value_layer = torch.cat((past_value, value_layer), dim=0)
```
For some unknown reason, `layer_past` is set to the string `'past_key_values'`, so `layer_past[0]` and `layer_past[1]` yield the characters `'p'` and `'a'`, which causes the argument type mismatch in `torch.cat`.
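The type mismatch is easy to reproduce in isolation. Below is a minimal sketch (my own standalone illustration, not the actual modeling_chatglm.py code path) of what happens when `torch.cat` receives characters sliced from that string:

```python
import torch

# Minimal sketch of the failure mode described above: layer_past
# unexpectedly holds a string, so indexing it yields characters.
layer_past = 'past_key_values'
past_key, past_value = layer_past[0], layer_past[1]  # 'p', 'a'

key_layer = torch.zeros(1, 1, 4)
try:
    torch.cat((past_key, key_layer), dim=0)  # 'p' is not a Tensor
except TypeError as e:
    # e.g. "expected Tensor as element 0 in argument 0, but got str"
    print(e)
```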
Additional information about the file changes I made:
- To make the code run on CPU, I commented out line 161 in quantization.py, `kernels = ctypes.cdll.LoadLibrary(kernel_file)`, and replaced it with `kernels = ctypes.CDLL(kernel_file, winmode=0)` (see the ctypes sketch after this list).
- To avoid an `sp_tokenizer is not defined` error, I moved the line `self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)` from line 205 to line 182 in tokenization_chatglm.py (see the init-order sketch below).
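For reference, here is a self-contained sketch of the kernel-loading change from the first bullet (the path is the one from my setup; on Python 3.8+ under Windows, `winmode=0` selects the legacy DLL search behavior, which is what made loading succeed for me):

```python
import ctypes

# Sketch of the quantization.py change: ctypes.CDLL with winmode=0
# uses the legacy Windows DLL search path, which let the compiled
# kernel library load where ctypes.cdll.LoadLibrary failed.
kernel_file = "D:\\models\\zhipu\\chatglm-6b-int4\\quantization_kernels_parallel.so"
try:
    kernels = ctypes.CDLL(kernel_file, winmode=0)
except OSError as exc:
    print(f"Failed to load kernels: {exc}")
```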
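And a minimal sketch (hypothetical classes, not the real ChatGLMTokenizer / PreTrainedTokenizer) of why moving the `sp_tokenizer` assignment earlier helps: the base-class `__init__` can call back into subclass methods that already expect `self.sp_tokenizer` to exist, so the attribute must be assigned before `super().__init__()` runs:

```python
# Hypothetical classes illustrating the initialization-order issue.
class Base:
    def __init__(self):
        # The base constructor calls back into the subclass...
        self.size = self.vocab_size()

class Tok(Base):
    def __init__(self):
        self.sp_tokenizer = ["<pad>", "<bos>"]  # must be set BEFORE super().__init__()
        super().__init__()

    def vocab_size(self):
        # ...which fails if sp_tokenizer has not been set yet.
        return len(self.sp_tokenizer)

print(Tok().size)  # 2
```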
Expected Behavior
The `model.chat()` call should complete without errors.
Steps To Reproduce
Please see the Current Behavior section above.
Environment
- OS: Windows 11 Home (Chinese edition)
- Python: 3.12.7
- Transformers: 4.42.0
- PyTorch: 2.5.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : False
Anything else?
No response