T5 fine-tuning for summarization: decoder_input_ids and labels
Hello @abhimishra91,
I was trying to implement the fine-tuning of T5 as explained in your notebook.
In addition to implementing the same structure as yours, I have run some experiments with the HuggingFace Trainer class. The decoder_input_ids and labels parameters are not very clear to me. When you train the model, you do this:
```python
y = data['target_ids'].to(device, dtype=torch.long)
y_ids = y[:, :-1].contiguous()
lm_labels = y[:, 1:].clone().detach()
lm_labels[y[:, 1:] == tokenizer.pad_token_id] = -100
```
where y_ids is the decoder_input_ids. I don't understand why this preprocessing is needed. Could you explain why you skip the last token of the target_ids, and why you replace the pads with -100 in the labels?
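For context, here is a toy illustration of what that preprocessing produces. The token ids are made up, and I am relying on the fact that PyTorch's CrossEntropyLoss ignores target positions equal to -100 by default:

```python
import torch

pad_id = 0  # T5's pad token id in the standard checkpoints
# made-up target sequence: [pad, 71, 72, 73, pad]
y = torch.tensor([[pad_id, 71, 72, 73, pad_id]])

y_ids = y[:, :-1].contiguous()         # decoder_input_ids -> [[0, 71, 72, 73]]
lm_labels = y[:, 1:].clone().detach()  # labels            -> [[71, 72, 73, 0]]
lm_labels[y[:, 1:] == pad_id] = -100   # pad positions     -> [[71, 72, 73, -100]]
```

As far as I can tell, each decoder input position then lines up with the next token of the target, and the -100 positions are excluded from the loss.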
When I use the HuggingFace Trainer, I need to tweak the __getitem__ method of my Dataset like this:
```python
def __getitem__(self, idx):
    ...
    item['decoder_input_ids'] = y[:-1]                # drop the last target token
    lbl = y[1:].clone()                               # shift the labels left by one
    lbl[y[1:] == self.tokenizer.pad_token_id] = -100  # ignore pads in the loss
    item['labels'] = lbl
    return item
```
Otherwise the loss does not decrease over time.
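For completeness, here is a minimal self-contained sketch of the Dataset I am describing. The class name, column names, tokenizer call, and max lengths are placeholders I chose for illustration; the label shift follows the same pattern as your notebook:

```python
import torch
from torch.utils.data import Dataset


class SummarizationDataset(Dataset):
    """Pairs of (document, summary) tokenized for T5 fine-tuning."""

    def __init__(self, texts, summaries, tokenizer, max_source_len=512, max_target_len=150):
        self.texts = texts
        self.summaries = summaries
        self.tokenizer = tokenizer
        self.max_source_len = max_source_len
        self.max_target_len = max_target_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        source = self.tokenizer(
            self.texts[idx], max_length=self.max_source_len,
            padding="max_length", truncation=True, return_tensors="pt",
        )
        target = self.tokenizer(
            self.summaries[idx], max_length=self.max_target_len,
            padding="max_length", truncation=True, return_tensors="pt",
        )
        y = target["input_ids"].squeeze(0)

        item = {
            "input_ids": source["input_ids"].squeeze(0),
            "attention_mask": source["attention_mask"].squeeze(0),
            "decoder_input_ids": y[:-1],  # drop the last target token
        }
        labels = y[1:].clone()            # shift the labels left by one
        labels[y[1:] == self.tokenizer.pad_token_id] = -100  # pads do not count in the loss
        item["labels"] = labels
        return item


# usage sketch (placeholder model name):
# tokenizer = T5Tokenizer.from_pretrained("t5-small")
# train_dataset = SummarizationDataset(train_texts, train_summaries, tokenizer)
```

If I understand the library correctly, recent versions of transformers can also build decoder_input_ids internally from labels when only labels are passed to T5ForConditionalGeneration, so the manual shift above may not be strictly necessary with the Trainer.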
Thank you for your help!
Hi, @marcoabrate!
I am also having trouble computing the loss. Can you share the full code for your training? Did you use multi-GPU?
Hi @Gorodecki, I have abandoned this code, since there are many seq2seq training and testing examples in the HuggingFace library itself; you can check them out here: https://github.com/huggingface/transformers/tree/master/examples/seq2seq
I was not using multi-GPU. Hope this helps!
@Gorodecki, @marcoabrate: so far I have found this one useful: https://github.com/huggingface/notebooks/blob/master/examples/summarization.ipynb