Q-LLM
AttributeError: 'RotaryEmbeddingESM' object has no attribute 'shape'
Thanks for publishing your code.
I ran into a problem when running it as described in the Usage section.
My code is as follows:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig, LlamaForCausalLM
import transformers
from omegaconf import OmegaConf
from qllm.utils import patch_hf, GreedySearch, patch_model_center
conf = OmegaConf.load("../config/llama-qllm-repr4-l1k-bs128-topk8-w4.yaml")
model_path = "XXX"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).to("cuda:0")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, add_bos_token=True, add_eos_token=False)
model = patch_hf(model, "qllm", conf.model)
model = GreedySearch(model, tokenizer)
text = "XXX"
encoded_text = tokenizer.encode(text)
input_ids = torch.tensor(encoded_text).unsqueeze(0).to("cuda:0")
# your own usage
output = model.generate(input_ids, max_length=200)
The error log is as follows.
cos, sin = self.rotary_emb(value_states, position_ids)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/transformers/models/llama/modeling_llama.py", line 109, in forward
inv_freq_expanded = self.inv_freq[None, :, None].float().expand(position_ids.shape[0], -1, 1)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1709, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'RotaryEmbeddingESM' object has no attribute 'shape'
My transformers version is 4.39.2.
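(A quick way to confirm the installed version at runtime, using the standard transformers.__version__ attribute:

import transformers

# Print the installed transformers version; the traceback above was produced with 4.39.2.
print(transformers.__version__)
)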
Hi, you can use our models in qllm/models, e.g. from qllm.models import LlamaForCausalLM (this enables the question_ids parameter in forward).
The complete code is as follows:
import torch
from qllm.models import LlamaForCausalLM
from transformers import AutoTokenizer
import transformers
from omegaconf import OmegaConf
from qllm.utils import patch_hf, GreedySearch, patch_model_center
# Load the Q-LLM config and the base Llama 3 checkpoint
conf = OmegaConf.load("config/llama3-qllm-repr4-l1k-bs128-topk8-w4.yaml")
model_path = "models/Meta-Llama-3-8B-Instruct"
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).to("cuda:0")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, add_bos_token=True, add_eos_token=False)

# Patch the model with Q-LLM and wrap it for greedy decoding
model = patch_hf(model, "qllm", conf.model)
model = GreedySearch(model, tokenizer)

# Tokenize the input and generate
text = "xxx"
encoded_text = tokenizer.encode(text)
input_ids = torch.tensor(encoded_text).unsqueeze(0).to("cuda:0")
output = model.generate(input_ids, max_length=200)
print(output)
This works in our testing environment with transformers version 4.40.1.
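A minimal sketch of how the question_ids parameter mentioned above might be passed, assuming it takes a tokenized query laid out like input_ids; this is an assumption on my part, so check the forward signature in qllm/models before relying on it:

import torch
from omegaconf import OmegaConf
from transformers import AutoTokenizer
from qllm.models import LlamaForCausalLM
from qllm.utils import patch_hf

# Hedged sketch only: the reply above says forward of qllm.models.LlamaForCausalLM
# accepts question_ids. How that argument is constructed (a tokenized query,
# same layout as input_ids) is an assumption, not something confirmed here.
conf = OmegaConf.load("config/llama3-qllm-repr4-l1k-bs128-topk8-w4.yaml")
model_path = "models/Meta-Llama-3-8B-Instruct"

model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).to("cuda:0")
model = patch_hf(model, "qllm", conf.model)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

context = "xxx"    # the long input, as in the example above
question = "xxx"   # hypothetical: the query guiding the lookup

input_ids = torch.tensor(tokenizer.encode(context)).unsqueeze(0).to("cuda:0")
question_ids = torch.tensor(tokenizer.encode(question)).unsqueeze(0).to("cuda:0")

with torch.no_grad():
    # Direct forward call on the unwrapped model; question_ids is the extra
    # parameter the reply above refers to.
    outputs = model(input_ids=input_ids, question_ids=question_ids)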