- classifier.out_proj.weight: found shape torch.Size([16, 768]) in the checkpoint and torch.Size([2, 768]) in the model instantiated

Open pratikchhapolika opened this issue 3 years ago • 1 comments

I am fine-tuning your model on a binary classification task, so the number of classes is 2 instead of 16.

My class labels are just 0 and 1.

https://huggingface.co/unitary/unbiased-toxic-roberta/tree/main

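I am using the code below:

# Metric: accuracy on the binary labels
import numpy as np
import transformers as tr

def compute_metrics(eval_pred):
    # eval_pred is the (logits, labels) pair passed in by the Trainer
    logits, labels = eval_pred
    # Pick the highest-scoring class for each example
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc}

model = tr.RobertaForSequenceClassification.from_pretrained("/home/pc/unbiased_toxic_roberta", num_labels=2)
model.to(device)  # device is assumed to be defined earlier (not shown)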



training_args = tr.TrainingArguments(
    # report_to='wandb',
    output_dir='/home/pc/1_Proj_hate_speech/results_roberta',  # output directory
    overwrite_output_dir=True,
    num_train_epochs=20,             # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,               # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs3',           # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)


trainer = tr.Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,             # evaluation dataset
    compute_metrics=compute_metrics
)

Error:

- classifier.out_proj.weight: found shape torch.Size([16, 768]) in the checkpoint and torch.Size([2, 768]) in the model instantiated
- classifier.out_proj.bias: found shape torch.Size([16]) in the checkpoint and torch.Size([2]) in the model instantiated

How can I solve this?

pratikchhapolika, Jan 07 '22 11:01

Hello, and sorry for the late reply! You would first need to replace the final linear layer of the model with one that has your number of classes, in your case 2.
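One way to do that replacement can be sketched as follows. This is a minimal sketch under stated assumptions, not the maintainers' code: it assumes the `classifier.out_proj` layer named in the error message is the final linear layer, it uses a tiny randomly initialised `RobertaConfig` as a stand-in for the real 16-label checkpoint so that it runs without a download, and `replace_classification_head` is a hypothetical helper name.

```python
import torch
from transformers import RobertaConfig, RobertaForSequenceClassification


def replace_classification_head(model, num_labels):
    """Swap the final linear layer for a freshly initialised one (hypothetical helper)."""
    in_features = model.classifier.out_proj.in_features
    model.classifier.out_proj = torch.nn.Linear(in_features, num_labels)
    # Keep the config and the model attribute in sync; the loss computation
    # reshapes the logits using model.num_labels.
    model.config.num_labels = num_labels
    model.num_labels = num_labels
    return model


# Stand-in for the 16-label checkpoint: a tiny, randomly initialised model.
config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
    num_labels=16,
)
model = RobertaForSequenceClassification(config)
model = replace_classification_head(model, num_labels=2)

# Quick check with made-up token ids and a binary label.
input_ids = torch.tensor([[0, 5, 6, 7, 2]])
labels = torch.tensor([1])
with torch.no_grad():
    outputs = model(input_ids=input_ids, labels=labels)
logits = outputs.logits
```

With the real checkpoint, the model would presumably be loaded with its original 16 labels (that is, without `num_labels=2`) before the head is swapped. The new layer starts from random weights and is learned during fine-tuning.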

laurahanu, Apr 12 '22 17:04
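As an alternative to replacing the layer by hand, `from_pretrained` in recent versions of transformers accepts `ignore_mismatched_sizes=True`, which skips checkpoint weights whose shapes do not match the instantiated model and leaves those parameters newly initialised. The sketch below assumes such a version is installed. The tiny config and the temporary directory are stand-ins for the real 16-label checkpoint, chosen so the example runs offline.

```python
import tempfile

import torch
from transformers import RobertaConfig, RobertaForSequenceClassification

# Stand-in for the 16-label checkpoint: a tiny, randomly initialised model.
config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
    num_labels=16,
)
original = RobertaForSequenceClassification(config)

with tempfile.TemporaryDirectory() as checkpoint_dir:
    original.save_pretrained(checkpoint_dir)
    # Reload with 2 labels; the mismatched out_proj weights are skipped
    # and re-initialised instead of causing a size-mismatch failure.
    model = RobertaForSequenceClassification.from_pretrained(
        checkpoint_dir,
        num_labels=2,
        ignore_mismatched_sizes=True,
    )

# Layers whose shapes match are still loaded from the checkpoint.
dense_matches = torch.equal(
    original.classifier.dense.weight, model.classifier.dense.weight
)
```

Either way, the 2-class output layer is untrained after loading, so the model needs fine-tuning before its predictions mean anything.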