
TypeError: _forward_unimplemented() got an unexpected keyword argument 'input_ids'

Open QuantumStatic opened this issue 2 years ago • 2 comments

System Info

  • transformers version: 4.24.0
  • Platform: Windows-10-10.0.19044-SP0
  • Python version: 3.10.8
  • Huggingface_hub version: 0.11.0
  • PyTorch version (GPU?): 1.13.0+cu117 (True)

Who can help?

@ArthurZucker and @younesbelkada, since I am using distilbert-base-uncased
(and maybe @sgugger, since I am following this link on the Hugging Face website)

Information

  • [x] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [X] My own task or dataset (give details below)

Reproduction

I am using a custom dataset to fine-tune distilbert-base-uncased. I followed the method described on the Hugging Face website to the letter. Here is my code for making the dataset.

import torch
from sklearn.model_selection import train_test_split
from transformers import DistilBertTokenizerFast

def create_hugging_face_dataset(data: dict):
    train_text, test_text, train_label, test_label = train_test_split(data['text'], data['label'], test_size=0.1, shuffle=True)
    train_text, validation_text, train_label, validation_label = train_test_split(train_text, train_label, test_size=0.1, shuffle=True)

    tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

    train_encodings = tokenizer(train_text, truncation=True, padding=True)
    test_encodings = tokenizer(test_text, truncation=True, padding=True)
    validation_encodings = tokenizer(validation_text, truncation=True, padding=True)

    class MBICDataset(torch.utils.data.Dataset):
        def __init__(self, encodings, labels):
            self.encodings = encodings
            self.labels = labels

        def __getitem__(self, idx):
            item = {key: torch.Tensor(val[idx]) for key, val in self.encodings.items()}
            item['labels'] = torch.Tensor(self.labels[idx])
            return item

        def __len__(self):
            return len(self.labels)

    train_ds = MBICDataset(train_encodings, train_label)
    test_ds = MBICDataset(test_encodings, test_label)
    validation_ds = MBICDataset(validation_encodings, validation_label)


    FINAL_DS = {"train": train_ds, "test": test_ds, "validation": validation_ds}

    return FINAL_DS

After making the dataset, I try to fine-tune the model using the following code.


from transformers import (DistilBertPreTrainedModel, DistilBertTokenizerFast,
                          Trainer, TrainingArguments)

tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

training_stuff = {
    "batch_size": 64, 
    "epochs": 4, 
    "learning_rate": 1e-5,
    "weight_decay": 0.01
    }

training_args = TrainingArguments(
            output_dir="C:/Users/uujain2/Desktop/Utkarsh/FYP/Models/DistilBert",
            per_device_train_batch_size=training_stuff["batch_size"],
            evaluation_strategy="steps",
            num_train_epochs=training_stuff["epochs"],
            fp16=True,
            save_steps=100,
            eval_steps=50,
            logging_steps=10,
            weight_decay=training_stuff["weight_decay"],
            learning_rate=training_stuff["learning_rate"],
            save_total_limit=64,
            remove_unused_columns=False,
            push_to_hub=False,
            report_to='tensorboard',
            load_best_model_at_end=True,
        )


model = DistilBertPreTrainedModel.from_pretrained(
    'distilbert-base-uncased',
    num_labels=3,
    id2label={0: 'Biased', 1: 'Non-biased', 2: 'No agreement'},
    label2id={'Biased': 0, 'Non-biased': 1, 'No agreement': 2},
    )

trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=FINAL_DS['train'],
            eval_dataset=FINAL_DS['validation'],
            tokenizer=tokenizer,
        )

train_results = trainer.train()

However, I run into the following error.

Traceback (most recent call last):
  File "c:\Users\uujain2\Desktop\Utkarsh\FYP\Code\test.py", line 68, in <module>
    train_results = trainer.train()
  File "C:\Users\uujain2\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\trainer.py", line 1501, in train
    return inner_training_loop(
  File "C:\Users\uujain2\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\trainer.py", line 1749, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "C:\Users\uujain2\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\trainer.py", line 2508, in training_step
    loss = self.compute_loss(model, inputs)
  File "C:\Users\uujain2\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\trainer.py", line 2540, in compute_loss
    outputs = model(**inputs)
  File "C:\Users\uujain2\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: _forward_unimplemented() got an unexpected keyword argument 'input_ids'

Expected behavior

I expect the model to start the fine-tuning process instead of throwing this error.

QuantumStatic avatar Jan 27 '23 11:01 QuantumStatic

DistilBertPreTrainedModel is an abstract class and shouldn't be used directly. Maybe you wanted to use DistilBertModel or DistilBertForPretraining?
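
If the end goal is classification with labels (as your Trainer setup suggests), DistilBertForSequenceClassification is the usual class to load; a minimal, untested sketch would be:

from transformers import DistilBertForSequenceClassification

# Sequence-classification head on top of DistilBERT; the label mappings below
# just mirror the ones from your snippet.
model = DistilBertForSequenceClassification.from_pretrained(
    'distilbert-base-uncased',
    num_labels=3,
    id2label={0: 'Biased', 1: 'Non-biased', 2: 'No agreement'},
    label2id={'Biased': 0, 'Non-biased': 1, 'No agreement': 2},
)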

sgugger avatar Jan 27 '23 13:01 sgugger

Thank you for your quick response. It was my silly mistake to use an abstract class. I was able to import DistilBertModel; however, the import of DistilBertForPretraining failed, but that's alright.

However when I try to run the model now I get the following error.

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected).

I have followed the webpage titled Fine-tuning with custom datasets. My function that creates the initial lists of texts and labels is below; the data is formatted very similarly to the example on that page:

import csv

def create_MBIC_data_dict() -> dict[str, list]:
    data_dict = {'text': [], 'label':[]}
    with open(f"{DATA_FOLDER_PATH}/final_labels_MBIC_new.csv") as csv_file:
        csv_reader = csv.reader(csv_file)
        line_count = 0
        for row in csv_reader:
            if line_count != 0:
                data_dict['text'].append(row[0])
                label_val = -1
                match row[7]:
                    case "Biased":
                        label_val = 1
                    case "Non-biased":
                        label_val = 0
                    case "No agreement":
                        label_val = 2
                data_dict['label'].append(label_val)
            line_count += 1

    return data_dict

Afterwards, the create_hugging_face_dataset function runs on this dictionary and creates the dataset.
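
For reference, the dataset class on that page (reproduced from memory, so the details may differ slightly) builds each item roughly like this, and mine is modelled on it:

import torch

class IMDbDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        # Each encoding field and the label are converted with torch.tensor
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)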

@sgugger

QuantumStatic avatar Jan 27 '23 15:01 QuantumStatic

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Feb 26 '23 15:02 github-actions[bot]