peft
AttributeError: 'NoneType' object has no attribute 'shape'
System Info
python 3.10
Who can help?
No response
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder
- [X] My own task or dataset (give details below)
Reproduction
train_losses = []
for epoch in range(epochs):
    model.train()
    epoch_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()
        input_ids = batch["input_ids"].to(device)
        print("Input IDs:", input_ids)  # added this line to print input_ids
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)
        max_seq_length = input_ids.shape[1]
        padded_labels = torch.nn.functional.pad(labels, (0, max_seq_length - labels.shape[1]), value=-100).to(device)
        print('------------------------------')
        print("Input IDs size:", input_ids.size())
        print(f"Attention Mask size: {attention_mask.size()}")
        print(f"Padded Labels size: {padded_labels.size()}")
        if input_ids is None:
            raise ValueError("input_ids is None")
        if attention_mask is None:
            raise ValueError("attention_mask is None")
        if padded_labels is None:
            raise ValueError("padded_labels is None")
        print("Type of input_ids:", type(input_ids))
        print("Type of attention_mask:", type(attention_mask))
        print("Type of padded_labels:", type(padded_labels))
        print(model.forward)
        from peft import PeftModelForSequenceClassification
        print(PeftModelForSequenceClassification.forward)
        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    train_losses.append(epoch_loss / len(train_loader))
Expected behavior
Traceback (most recent call last):
File "/share/home/aim/aim_zhujj/bc2/glm_bc2_pt.py", line 114, in <module>
outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/peft/peft_model.py", line 1283, in forward
return self.base_model(inputs_embeds=inputs_embeds, **kwargs)
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/share/home/aim/aim_zhujj/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 878, in forward
transformer_outputs = self.transformer(
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/share/home/aim/aim_zhujj/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 757, in forward
batch_size, seq_length = input_ids.shape
AttributeError: 'NoneType' object has no attribute 'shape'
Could you please provide the code that loads the base model and then applies the PEFT model on top?
tokenizer = AutoTokenizer.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    trust_remote_code=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to(device).eval()
tokenizer.pad_token = tokenizer.eos_token
config = PromptEncoderConfig(
    task_type=TaskType.TOKEN_CLS,
    num_virtual_tokens=10,
    encoder_reparameterization_type=PromptEncoderReparameterizationType.MLP,
    encoder_dropout=0.1,
    encoder_num_layers=4,
    encoder_hidden_size=4096,
)
model = get_peft_model(base_model, config)
Thanks for the additional details. I could reproduce the error using the model THUDM/glm-4-9b-chat. The issue is that this model uses custom code, which is not compatible with PromptEncoder. As the error indicates, this line fails:
batch_size, seq_length = input_ids.shape
AttributeError: 'NoneType' object has no attribute 'shape'
This is because the prompt encoder does not pass input_ids and instead passes inputs_embeds directly. I could patch the issue by falling back to inputs_embeds when input_ids is missing, editing this line:
https://huggingface.co/THUDM/glm-4-9b-chat/blob/c24133cef34ff7a7010f1e97c113effdead0966b/modeling_chatglm.py#L875
if input_ids is not None:
    batch_size, seq_length = input_ids.shape
else:
    batch_size, seq_length, _ = inputs_embeds.shape
Not sure if that's an option for you or not. Maybe you could also create a PR on the ChatGLM repo to suggest this change.
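To illustrate why input_ids arrives as None: with prompt learning, the learned virtual-token embeddings are prepended to the input embeddings, and the base model is then called with inputs_embeds only (see the peft_model.py frame in the traceback above). Here is a minimal sketch of that shape bookkeeping, assuming toy sizes and a plain torch.nn.Embedding as a hypothetical stand-in for the real prompt encoder. It is not PEFT's actual implementation.

```python
import torch

# Toy sizes, chosen only for illustration.
batch_size, seq_length, hidden_size = 2, 5, 16
vocab_size = 100
num_virtual_tokens = 10  # same value as in the config above

word_embeddings = torch.nn.Embedding(vocab_size, hidden_size)
# Hypothetical stand-in for the prompt encoder: one learned vector per virtual token.
prompt_embeddings = torch.nn.Embedding(num_virtual_tokens, hidden_size)

input_ids = torch.randint(0, vocab_size, (batch_size, seq_length))
attention_mask = torch.ones(batch_size, seq_length, dtype=torch.long)

# Embed the real tokens, then prepend the virtual-token embeddings.
inputs_embeds = word_embeddings(input_ids)
prompts = prompt_embeddings(torch.arange(num_virtual_tokens))
prompts = prompts.unsqueeze(0).expand(batch_size, -1, -1)
inputs_embeds = torch.cat((prompts, inputs_embeds), dim=1)

# The attention mask is extended to cover the virtual tokens as well.
prefix_mask = torch.ones(batch_size, num_virtual_tokens, dtype=attention_mask.dtype)
attention_mask = torch.cat((prefix_mask, attention_mask), dim=1)

print(inputs_embeds.shape)   # torch.Size([2, 15, 16])
print(attention_mask.shape)  # torch.Size([2, 15])
```

From this point on there are no token ids for the virtual positions, so a base model that unconditionally reads input_ids.shape fails, while one that falls back to inputs_embeds.shape keeps working.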
Thank you. Should this line change? outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
No, this line can stay as is. PEFT will handle the extension of the embeddings internally.
When I change this code (batch_size, seq_length = input_ids.shape), it is always automatically reverted to the original version from Hugging Face.
You mean the code in modeling_chatglm.py? Maybe there is logic that checks whether the file needs updating and then re-downloads it, overwriting the change. Instead, you could try monkey patching the forward method with a custom method that includes the suggested change.
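A minimal sketch of the monkey-patching approach, using a hypothetical toy class in place of the real ChatGLM module so that it runs without downloading the model. ToyTransformer, patched_forward, and the SimpleNamespace input are illustrative assumptions. A real patch would have to reproduce the full forward body from modeling_chatglm.py with the suggested change applied.

```python
from types import MethodType, SimpleNamespace


class ToyTransformer:
    """Hypothetical stand-in for the remote-code transformer module."""

    def forward(self, input_ids=None, inputs_embeds=None):
        # Mirrors the failing line: raises AttributeError when input_ids is None.
        batch_size, seq_length = input_ids.shape
        return batch_size, seq_length


def patched_forward(self, input_ids=None, inputs_embeds=None):
    # Same logic, but falls back to inputs_embeds when input_ids is missing.
    if input_ids is not None:
        batch_size, seq_length = input_ids.shape
    else:
        batch_size, seq_length, _ = inputs_embeds.shape
    return batch_size, seq_length


model = ToyTransformer()
# Bind the replacement to this instance only; the class and its source file stay untouched.
model.forward = MethodType(patched_forward, model)

# Any object with a .shape attribute is enough for this toy example.
fake_embeds = SimpleNamespace(shape=(2, 15, 16))
result = model.forward(inputs_embeds=fake_embeds)
print(result)  # (2, 15)
```

Because the patch is applied at runtime to the loaded object, it is not affected when the cached modeling file is overwritten, but it has to be re-applied in every script that loads the model.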
Sorry, there is a new issue:
Traceback (most recent call last):
File "/share/home/aim/aim_zhujj/bc2/glm_bc2_pt.py", line 141, in
I cannot reproduce this. Since I don't know what data you used, I'm using some dummy data:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import get_peft_model, PromptEncoderReparameterizationType, PromptEncoderConfig, TaskType
model_id = "THUDM/glm-4-9b-chat"
base_model = AutoModelForCausalLM.from_pretrained(
    model_id, low_cpu_mem_usage=True, device_map=0, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
inputs = tokenizer("hello world", return_tensors="pt").to(0)
config = PromptEncoderConfig(
    task_type=TaskType.TOKEN_CLS,
    num_virtual_tokens=10,
    encoder_reparameterization_type=PromptEncoderReparameterizationType.MLP,
    encoder_dropout=0.1,
    encoder_num_layers=4,
    encoder_hidden_size=4096,
)
model = get_peft_model(base_model, config)
labels = torch.nn.functional.pad(inputs["input_ids"], (0, config.num_virtual_tokens), value=-100).to(0)
output = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], labels=labels)
print(type(output))
# prints: <class 'transformers.modeling_outputs.CausalLMOutputWithPast'>
print(output.loss)
# prints: tensor(13., device='cuda:0', dtype=torch.bfloat16, grad_fn=<ToCopyBackward0>)
output.loss.backward()
My transformers version is 4.43.3 in case that matters.
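For reference, the labels line in the snippet above extends the label tensor by num_virtual_tokens positions filled with -100. That value is the default ignore_index of PyTorch's cross-entropy loss, so the added positions do not contribute to the loss. A small sketch with toy values (the tensor contents and the number of classes are made up for illustration):

```python
import torch
import torch.nn.functional as F

num_virtual_tokens = 3  # toy value; the snippet above uses 10
num_classes = 11        # toy value
labels = torch.tensor([[5, 7]])

# (0, num_virtual_tokens) pads the last dimension: nothing on the left, 3 on the right.
padded = F.pad(labels, (0, num_virtual_tokens), value=-100)
print(padded.tolist())  # [[5, 7, -100, -100, -100]]

logits = torch.randn(1, 5, num_classes)  # (batch, sequence, classes)
loss_fn = torch.nn.CrossEntropyLoss()    # ignore_index defaults to -100

# Loss over all 5 positions, where the 3 padded ones are ignored ...
loss = loss_fn(logits.view(-1, num_classes), padded.view(-1))
# ... equals the loss computed over the 2 real positions only.
loss_ref = loss_fn(logits[0, :2], labels[0])
print(torch.allclose(loss, loss_ref))  # True
```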
My data looks like this: with O (O is the label)
Sorry, I don't understand this.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptEncoderConfig, TaskType, get_peft_model, PromptEncoderReparameterizationType
import numpy as np
import matplotlib.pyplot as plt
import os
# Enable offline mode
os.environ["TRANSFORMERS_OFFLINE"] = "1"
# Read the training data
with open('./final_train.txt', 'r') as file:
    train_data = file.readlines()
train_texts = []
train_labels = []
invalid_lines_count = 0
for line in train_data:
    if line.strip():
        parts = line.strip().split("\t")
        if len(parts) == 2:
            word, label = parts
            if len(word) == 1 and not word.isalnum():
                train_texts.append(word)
                train_labels.append("O")
            else:
                train_texts.append(word)
                train_labels.append(label)
        else:
            invalid_lines_count += 1
print(f"Number of invalid lines: {invalid_lines_count}")
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
tokenizer = AutoTokenizer.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    use_fast=False,
    trust_remote_code=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    revision="specific_version",
).to(device).eval()
config = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=10,
    encoder_reparameterization_type=PromptEncoderReparameterizationType.MLP,
    encoder_dropout=0.1,
    encoder_num_layers=4,
    encoder_hidden_size=1024,
)
model = get_peft_model(base_model, config)
# Build the fine-tuning task
train_texts = ['''Generate BIO tags for each word in the given paragraph,. The BIO format uses the following labels:
• B: Beginning of an entity
• I: Inside of an entity
• O: Outside of an entity
Please extract all chemicals, genes, and diseases mentioned in the paragraph. Provide the output in the format <word> - <BIO tag>, where each word is followed by its corresponding BIO tag.
''' + text for text in train_texts]
train_encodings = tokenizer(train_texts, truncation=True, padding=True, return_tensors="pt", max_length=256)
train_labels_encodings = tokenizer(train_labels, truncation=True, padding=True, return_tensors="pt", max_length=256)
tokenizer.pad_token = tokenizer.eos_token
class Dataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels_encodings):
        self.encodings = encodings
        self.labels_encodings = labels_encodings

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels_encodings['input_ids'][idx])
        return item

    def __len__(self):
        return len(self.encodings.input_ids)
train_dataset = Dataset(train_encodings, train_labels_encodings)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.to(device)
epochs = 2
train_losses = []
for epoch in range(epochs):
    model.train()
    epoch_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)
        max_seq_length = input_ids.shape[1]
        padded_labels = torch.nn.functional.pad(labels, (0, max_seq_length - labels.shape[1]), value=-100).to(device)
        # Debugging outputs
        print(f"Batch size: {input_ids.size(0)}")
        print(f"Max sequence length: {max_seq_length}")
        print(f"Input IDs: {input_ids}")
        print(f"Attention Mask: {attention_mask}")
        print(f"Padded Labels: {padded_labels}")
        # Check for None
        if input_ids is None:
            raise ValueError("input_ids is None")
        if attention_mask is None:
            raise ValueError("attention_mask is None")
        if padded_labels is None:
            raise ValueError("padded_labels is None")
        # Check types
        print(f"Type of input_ids: {type(input_ids)}")
        print(f"Type of attention_mask: {type(attention_mask)}")
        print(f"Type of padded_labels: {type(padded_labels)}")
        # Check shapes
        print(f"Shape of input_ids: {input_ids.shape}")
        print(f"Shape of attention_mask: {attention_mask.shape}")
        print(f"Shape of padded_labels: {padded_labels.shape}")
        # Forward pass
        try:
            outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
            loss = outputs.loss
            print(f"Loss: {loss.item()}")
        except Exception as e:
            print(f"Error during model forward pass: {e}")
            print(f"input_ids: {input_ids}")
            print(f"attention_mask: {attention_mask}")
            print(f"padded_labels: {padded_labels}")
            # Inspect model layers
            for name, param in model.named_parameters():
                if param is None:
                    print(f"Layer {name} has None as its parameter.")
            # Inspect outputs
            if 'outputs' in locals():
                print(f"Outputs: {outputs}")
                if outputs is not None:
                    print(f"Outputs type: {type(outputs)}")
                    if hasattr(outputs, 'loss'):
                        print(f"Outputs.loss shape: {outputs.loss.shape}")
            else:
                print("Outputs are not defined")
            raise
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    train_losses.append(epoch_loss / len(train_loader))
plt.plot(np.arange(1, epochs + 1), train_losses, label="Training Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.title("Training Loss Curve")
plt.legend()
plt.savefig("training_loss_curve.png")
torch.save(model.state_dict(), "/data/aim_nuist/aim_zhujj/xinjian/glm_bc2_pt_model.pt")
That is all of my code. I can't find the difference between your code and mine.
I got your code working using THUDM/glm-4-9b-chat and the data you attached earlier.
In addition to the code change discussed above, I had to change these lines:
https://huggingface.co/THUDM/glm-4-9b-chat/blob/c24133cef34ff7a7010f1e97c113effdead0966b/modeling_chatglm.py#L880-L882
if full_attention_mask is None:
    if (attention_mask is not None and not attention_mask.all()) or (past_key_values and seq_length != 1):
        fake_ids = torch.zeros(batch_size, seq_length, dtype=torch.long, device=inputs_embeds.device)  # <=
        full_attention_mask = self.get_masks(fake_ids, past_key_values, padding_mask=attention_mask)  # <=
However, I never saw the error you reported:
AttributeError: 'BaseModelOutputWithPast' object has no attribute 'loss'
Could this be because you're using a slightly different model?
Submitted, follow https://huggingface.co/THUDM/glm-4-9b-chat/discussions/74
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.