AttributeError: 'NoneType' object has no attribute 'shape'

Open jjzhu0579 opened this issue 1 year ago • 14 comments

System Info

python 3.10

Who can help?

No response

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder
  • [X] My own task or dataset (give details below)

Reproduction

train_losses = []
for epoch in range(epochs):
    model.train()
    epoch_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()
        input_ids = batch["input_ids"].to(device)
        print("Input IDs:", input_ids)  # added this line to print input_ids
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)
        max_seq_length = input_ids.shape[1]
        padded_labels = torch.nn.functional.pad(labels, (0, max_seq_length - labels.shape[1]), value=-100).to(device)
        print('------------------------------')
        print("Input IDs size:", input_ids.size())
        print(f"Attention Mask size: {attention_mask.size()}")
        print(f"Padded Labels size: {padded_labels.size()}")
        if input_ids is None:
            raise ValueError("input_ids is None")
        if attention_mask is None:
            raise ValueError("attention_mask is None")
        if padded_labels is None:
            raise ValueError("padded_labels is None")
        print("Type of input_ids:", type(input_ids))
        print("Type of attention_mask:", type(attention_mask))
        print("Type of padded_labels:", type(padded_labels))
        print(model.forward)
        from peft import PeftModelForSequenceClassification
        print(PeftModelForSequenceClassification.forward)

        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    train_losses.append(epoch_loss / len(train_loader))

Expected behavior

Traceback (most recent call last):
  File "/share/home/aim/aim_zhujj/bc2/glm_bc2_pt.py", line 114, in <module>
    outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/peft/peft_model.py", line 1283, in forward
    return self.base_model(inputs_embeds=inputs_embeds, **kwargs)
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/share/home/aim/aim_zhujj/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 878, in forward
    transformer_outputs = self.transformer(
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/aim_nuist/aim_zhujj/.conda/envs/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/share/home/aim/aim_zhujj/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 757, in forward
    batch_size, seq_length = input_ids.shape
AttributeError: 'NoneType' object has no attribute 'shape'

jjzhu0579 avatar Jul 30 '24 10:07 jjzhu0579

Could you please provide the code that loads the base model and then applies the PEFT model on top?

BenjaminBossan avatar Jul 30 '24 10:07 BenjaminBossan

Could you please provide the code that loads the base model and then applies the PEFT model on top?

tokenizer = AutoTokenizer.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    trust_remote_code=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to(device).eval()

tokenizer.pad_token = tokenizer.eos_token

config = PromptEncoderConfig(
    task_type=TaskType.TOKEN_CLS,
    num_virtual_tokens=10,
    encoder_reparameterization_type=PromptEncoderReparameterizationType.MLP,
    encoder_dropout=0.1,
    encoder_num_layers=4,
    encoder_hidden_size=4096,
)
model = get_peft_model(base_model, config)

jjzhu0579 avatar Jul 30 '24 10:07 jjzhu0579

Thanks for the additional details. I could reproduce the error using the model THUDM/glm-4-9b-chat. The issue is that this model uses custom code, which is not compatible with PromptEncoder. As the error indicates, this line fails:

    batch_size, seq_length = input_ids.shape
AttributeError: 'NoneType' object has no attribute 'shape'

This is because the prompt encoder does not pass input_ids, instead passing the inputs_embeds directly. I could patch the issue by using inputs_embeds instead by editing this line:

https://huggingface.co/THUDM/glm-4-9b-chat/blob/c24133cef34ff7a7010f1e97c113effdead0966b/modeling_chatglm.py#L875

        if input_ids is not None:
            batch_size, seq_length = input_ids.shape
        else:
            batch_size, seq_length, _ = inputs_embeds.shape

Not sure if that's an option for you or not. Maybe you could also create a PR on the ChatGLM repo to suggest this change.
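
A minimal sketch of that fallback as a standalone helper, in case it helps to see the branch in isolation. The function name `infer_batch_and_seq_len` is hypothetical and is not part of modeling_chatglm.py. It only relies on a `.shape` attribute, so small stand-in objects are used here instead of real tensors.

```python
from types import SimpleNamespace


def infer_batch_and_seq_len(input_ids=None, inputs_embeds=None):
    """Return (batch_size, seq_length) from whichever input was provided."""
    if input_ids is not None:
        # input_ids has shape (batch, seq)
        batch_size, seq_length = input_ids.shape
    elif inputs_embeds is not None:
        # inputs_embeds has shape (batch, seq, hidden)
        batch_size, seq_length, _ = inputs_embeds.shape
    else:
        raise ValueError("Either input_ids or inputs_embeds must be provided")
    return batch_size, seq_length


# Stand-ins for tensors: only the .shape attribute matters for this sketch.
ids = SimpleNamespace(shape=(2, 5))
embeds = SimpleNamespace(shape=(2, 15, 4096))

shape_from_ids = infer_batch_and_seq_len(input_ids=ids)
shape_from_embeds = infer_batch_and_seq_len(inputs_embeds=embeds)
```

The explicit `ValueError` for the case where both are missing is an addition for this sketch; the patch quoted above only has the two branches.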

BenjaminBossan avatar Jul 30 '24 12:07 BenjaminBossan

Thank you. Should this line change?

outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)

jjzhu0579 avatar Jul 30 '24 13:07 jjzhu0579

Should this line change? outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)

No, this line can stay as is. PEFT will handle the extension of the embeddings internally.
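
To illustrate what "extension" means here, the following is a plain-Python sketch of the shape bookkeeping only, not PEFT's actual code. With prompt learning, the virtual-token embeddings are placed in front of the input embeddings, so the base model sees `num_virtual_tokens` more positions than the caller passed in. The helper names and the list-based "tensors" are assumptions made for illustration.

```python
def prepend_virtual_tokens(inputs_embeds, prompt_embeds):
    """Prepend the same prompt embeddings to every sequence in the batch."""
    return [list(prompt_embeds) + list(seq) for seq in inputs_embeds]


def extend_attention_mask(attention_mask, num_virtual_tokens):
    """Virtual tokens are always attended to, so prefix each row with ones."""
    return [[1] * num_virtual_tokens + list(row) for row in attention_mask]


num_virtual_tokens = 10
hidden = 4
# A batch of 2 sequences with 3 tokens each (nested lists stand in for tensors).
inputs_embeds = [[[0.0] * hidden for _ in range(3)] for _ in range(2)]
prompt_embeds = [[1.0] * hidden for _ in range(num_virtual_tokens)]
attention_mask = [[1, 1, 1], [1, 1, 0]]

extended = prepend_virtual_tokens(inputs_embeds, prompt_embeds)
extended_mask = extend_attention_mask(attention_mask, num_virtual_tokens)
```

This is also why the base model receives inputs_embeds rather than input_ids, which is what triggered the original error.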

BenjaminBossan avatar Jul 30 '24 13:07 BenjaminBossan

When I change this code (batch_size, seq_length = input_ids.shape), it always automatically reverts to the Hugging Face version from before the change.

jjzhu0579 avatar Jul 30 '24 13:07 jjzhu0579

When I change this code, it always automatically reverts from hugging face to before the change

You mean the code in modeling_chatglm.py? Maybe there is a logic to check if the file needs updating, and then it is re-downloaded, overwriting the change. Instead, you could try monkey patching the forward method with a custom method that includes the suggested change.
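
A rough sketch of the monkey-patching idea on a toy class, since the real forward method is too long to reproduce here. `ToyModel` and `Shaped` are hypothetical stand-ins, not ChatGLM or torch classes; the point is only that the method is replaced in memory, so nothing on disk can be overwritten by a re-download.

```python
import functools


class ToyModel:
    """Stand-in for a remote-code model whose forward requires input_ids."""

    def forward(self, input_ids=None, inputs_embeds=None):
        batch_size, seq_length = input_ids.shape  # fails if input_ids is None
        return batch_size, seq_length


class Shaped:
    """Tiny stand-in for a tensor: it only carries a shape."""

    def __init__(self, *shape):
        self.shape = shape


_original_forward = ToyModel.forward


@functools.wraps(_original_forward)
def patched_forward(self, input_ids=None, inputs_embeds=None):
    if input_ids is None and inputs_embeds is not None:
        # Fallback suggested in the thread: read the shape from inputs_embeds.
        batch_size, seq_length, _ = inputs_embeds.shape
        return batch_size, seq_length
    return _original_forward(self, input_ids=input_ids, inputs_embeds=inputs_embeds)


# Replace the method on the class in memory; no file on disk is edited.
ToyModel.forward = patched_forward

model = ToyModel()
result_ids = model.forward(input_ids=Shaped(2, 5))
result_embeds = model.forward(inputs_embeds=Shaped(2, 15, 4096))
```

For the real model, the same assignment would target the class that owns the failing forward in modeling_chatglm.py, and the replacement would have to reproduce the rest of that method. Which class that is depends on the model revision, so treat this as a pattern rather than a drop-in fix.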

BenjaminBossan avatar Jul 30 '24 13:07 BenjaminBossan

you could try monkey patching the forward method with a custom method that includes the suggested change.

Sorry, there is a new issue:

Traceback (most recent call last):
  File "/share/home/aim/aim_zhujj/bc2/glm_bc2_pt.py", line 141, in <module>
    loss = outputs.loss
AttributeError: 'BaseModelOutputWithPast' object has no attribute 'loss'

jjzhu0579 avatar Jul 30 '24 14:07 jjzhu0579

I cannot reproduce this. Since I don't know what data you used, I'm using some dummy data:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import get_peft_model, PromptEncoderReparameterizationType, PromptEncoderConfig, TaskType

model_id = "THUDM/glm-4-9b-chat"
base_model = AutoModelForCausalLM.from_pretrained(
    model_id, low_cpu_mem_usage=True, device_map=0, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
inputs = tokenizer("hello world", return_tensors="pt").to(0)
config = PromptEncoderConfig(
    task_type=TaskType.TOKEN_CLS,
    num_virtual_tokens=10,
    encoder_reparameterization_type=PromptEncoderReparameterizationType.MLP,
    encoder_dropout=0.1,
    encoder_num_layers=4,
    encoder_hidden_size=4096,
)
model = get_peft_model(base_model, config)

labels = torch.nn.functional.pad(inputs["input_ids"], (0, config.num_virtual_tokens), value=-100).to(0)
output = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], labels=labels)
print(type(output))
# prints: <class 'transformers.modeling_outputs.CausalLMOutputWithPast'>
print(output.loss)
# prints: tensor(13., device='cuda:0', dtype=torch.bfloat16, grad_fn=<ToCopyBackward0>)
output.loss.backward()

My transformers version is 4.43.3 in case that matters.
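
For comparing environments, a small standard-library sketch that reports the installed versions. The helper name `installed_versions` and the choice of packages are assumptions for illustration; packages that are not installed are reported as None.

```python
from importlib import metadata


def installed_versions(packages=("transformers", "peft", "torch")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions()
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```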

BenjaminBossan avatar Jul 30 '24 14:07 BenjaminBossan

My data looks like this: with O (where O is the label)

jjzhu0579 avatar Jul 30 '24 14:07 jjzhu0579

My data looks like this: with O (where O is the label)

Sorry, I don't understand this.

BenjaminBossan avatar Jul 30 '24 16:07 BenjaminBossan

test.txt

jjzhu0579 avatar Jul 30 '24 16:07 jjzhu0579

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptEncoderConfig, TaskType, get_peft_model, PromptEncoderReparameterizationType
import numpy as np
import matplotlib.pyplot as plt
import os





# Enable offline mode
os.environ["TRANSFORMERS_OFFLINE"] = "1"
# Read the training data
with open('./final_train.txt', 'r') as file:
    train_data = file.readlines()
train_texts = []
train_labels = []
invalid_lines_count = 0

for line in train_data:
    if line.strip():
        parts = line.strip().split("\t")
        if len(parts) == 2:
            word, label = parts
            if len(word) == 1 and not word.isalnum():
                train_texts.append(word)
                train_labels.append("O")
            else:
                train_texts.append(word)
                train_labels.append(label)
        else:
            invalid_lines_count += 1

print(f"Number of invalid lines: {invalid_lines_count}")

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
tokenizer = AutoTokenizer.from_pretrained('/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',use_fast=False,
                                          trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    '/data/aim_nuist/aim_zhujj/xinjian/glm4_lora/ZhipuAI/glm-4-9b-chat',
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    revision="specific_version"
).to(device).eval()

config = PromptEncoderConfig(
    task_type=TaskType.CAUSAL_LM, num_virtual_tokens=10,
    encoder_reparameterization_type=PromptEncoderReparameterizationType.MLP,
    encoder_dropout=0.1, encoder_num_layers=4, encoder_hidden_size=1024)
model = get_peft_model(base_model, config)

# Build the fine-tuning task
train_texts = ['''Generate BIO tags for each word in the given paragraph,. The BIO format uses the following labels:
•    B: Beginning of an entity
•    I: Inside of an entity
•    O: Outside of an entity
Please extract all chemicals, genes, and diseases mentioned in the paragraph. Provide the output in the format <word> - <BIO tag>, where each word is followed by its corresponding BIO tag.
''' + text for text in train_texts]

train_encodings = tokenizer(train_texts, truncation=True, padding=True, return_tensors="pt", max_length=256)
train_labels_encodings = tokenizer(train_labels, truncation=True, padding=True, return_tensors="pt", max_length=256)
tokenizer.pad_token = tokenizer.eos_token

class Dataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels_encodings):
        self.encodings = encodings
        self.labels_encodings = labels_encodings

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels_encodings['input_ids'][idx])
        return item

    def __len__(self):
        return len(self.encodings.input_ids)


train_dataset = Dataset(train_encodings, train_labels_encodings)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.to(device)

epochs = 2
train_losses = []
for epoch in range(epochs):
    model.train()
    epoch_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)
        max_seq_length = input_ids.shape[1]
        padded_labels = torch.nn.functional.pad(labels, (0, max_seq_length - labels.shape[1]), value=-100).to(device)

        # Debugging outputs
        print(f"Batch size: {input_ids.size(0)}")
        print(f"Max sequence length: {max_seq_length}")
        print(f"Input IDs: {input_ids}")
        print(f"Attention Mask: {attention_mask}")
        print(f"Padded Labels: {padded_labels}")

        # Check for None
        if input_ids is None:
            raise ValueError("input_ids is None")
        if attention_mask is None:
            raise ValueError("attention_mask is None")
        if padded_labels is None:
            raise ValueError("padded_labels is None")

        # Check types
        print(f"Type of input_ids: {type(input_ids)}")
        print(f"Type of attention_mask: {type(attention_mask)}")
        print(f"Type of padded_labels: {type(padded_labels)}")

        # Check shapes
        print(f"Shape of input_ids: {input_ids.shape}")
        print(f"Shape of attention_mask: {attention_mask.shape}")
        print(f"Shape of padded_labels: {padded_labels.shape}")

        # Forward pass
        try:
            outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=padded_labels)
            loss = outputs.loss
            print(f"Loss: {loss.item()}")
        except Exception as e:
            print(f"Error during model forward pass: {e}")
            print(f"input_ids: {input_ids}")
            print(f"attention_mask: {attention_mask}")
            print(f"padded_labels: {padded_labels}")

            # Inspect model layers
            for name, param in model.named_parameters():
                if param is None:
                    print(f"Layer {name} has None as its parameter.")

            # Inspect outputs
            if 'outputs' in locals():
                print(f"Outputs: {outputs}")
                if outputs is not None:
                    print(f"Outputs type: {type(outputs)}")
                    if hasattr(outputs, 'loss'):
                        print(f"Outputs.loss shape: {outputs.loss.shape}")
            else:
                print("Outputs are not defined")

            raise

        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    train_losses.append(epoch_loss / len(train_loader))

plt.plot(np.arange(1, epochs + 1), train_losses, label="Training Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.title("Training Loss Curve")
plt.legend()
plt.savefig("training_loss_curve.png")

torch.save(model.state_dict(), "/data/aim_nuist/aim_zhujj/xinjian/glm_bc2_pt_model.pt")

That is all of my code. I can't find the difference between your code and mine.

jjzhu0579 avatar Jul 31 '24 01:07 jjzhu0579

I got your code working using THUDM/glm-4-9b-chat and the data you attached earlier.

In addition to the code change discussed above, I had to change these lines:

https://huggingface.co/THUDM/glm-4-9b-chat/blob/c24133cef34ff7a7010f1e97c113effdead0966b/modeling_chatglm.py#L880-L882

        if full_attention_mask is None:
            if (attention_mask is not None and not attention_mask.all()) or (past_key_values and seq_length != 1):
                fake_ids = torch.zeros(batch_size, seq_length, dtype=torch.long, device=inputs_embeds.device) # <=
                full_attention_mask = self.get_masks(fake_ids, past_key_values, padding_mask=attention_mask)  # <=

However, I never saw the error you reported:

AttributeError: 'BaseModelOutputWithPast' object has no attribute 'loss'

Could this be because you're using a slightly different model?
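
One way to check would be to print which class and which source file the loaded model actually comes from; the traceback earlier in the thread shows modeling_chatglm.py being loaded from the Hugging Face modules cache. A small sketch using only the standard library follows. The helper name `describe_class_source` is hypothetical, and a stdlib object stands in for the model so the snippet runs anywhere.

```python
import inspect
import json


def describe_class_source(obj):
    """Return (qualified class name, source file path) for an object."""
    cls = type(obj)
    try:
        source_file = inspect.getsourcefile(cls)
    except TypeError:
        # Built-in classes have no Python source file.
        source_file = None
    return f"{cls.__module__}.{cls.__qualname__}", source_file


# Stand-in object; with a real model this would be the loaded base model.
class_name, source_file = describe_class_source(json.JSONDecoder())
print(class_name, source_file)
```

Calling the helper on the base model before and after wrapping it with PEFT would show whether the expected modeling file is the one in use.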

BenjaminBossan avatar Jul 31 '24 09:07 BenjaminBossan

Submitted; follow https://huggingface.co/THUDM/glm-4-9b-chat/discussions/74

spianmo avatar Aug 25 '24 17:08 spianmo

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] avatar Sep 19 '24 15:09 github-actions[bot]