ml_things
pad token as eos token for gpt2 classification
Hello,
First of all, thanks a lot for the tutorials! I have a question regarding the pad token. In the GPT-2 for classification example, you set the padding token to be the eos token. Why is that the case? Shouldn't every sequence instead have a single eos token at its end when passed for classification, just like [CLS] in BERT?
@qwenzo In BERT, the [CLS] token is used during its pre-training for next-sentence prediction (NSP), so it carries information about the entire sequence. In GPT-2 you always use the last token in the sequence to predict what comes next, so that token already contains the information relevant from the entire sequence.
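A minimal sketch of the idea, assuming the Hugging Face `transformers` API (the model name and label count here are just placeholders): since GPT-2 ships without a pad token, you can reuse the eos token for padding, and then tell the model which id padding uses so it can locate the last real token for classification.

```python
from transformers import GPT2Tokenizer, GPT2ForSequenceClassification

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# GPT-2 has no pad token by default; reuse the eos token for padding.
tokenizer.pad_token = tokenizer.eos_token

model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
# Let the model know which id is padding, so it can pool the hidden state
# of the last non-pad token in each sequence for the classification head.
model.config.pad_token_id = tokenizer.pad_token_id
```

With this setup, the classification head reads the hidden state of the last non-padding token, which plays the role [CLS] plays in BERT.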