ml_things

pad token as eos token for gpt2 classification

Open qwenzo opened this issue 2 years ago • 1 comments

Hello,

First of all, thanks a lot for the tutorials! I have a question regarding the pad token. In the GPT-2 for classification example, you set the padding token to be the eos token. Why is that? Shouldn't every sequence have a single eos token at the end to be used for classification, just like [CLS] in BERT?

qwenzo avatar Feb 05 '23 10:02 qwenzo

@qwenzo In BERT, the [CLS] token is used during pre-training for next-sentence prediction (NSP), so it carries information about the entire sequence. In GPT-2 you always use the last token in the sequence to predict what comes next, so that token contains the information from the entire sequence.
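
The idea above can be sketched in plain Python. This is a simplified illustration (not the tutorial's actual code) of how a GPT-2 classifier picks which position to classify from when the pad token is set to the eos token: since everything after the first pad id is padding, the classifier reads the hidden state at the last real token. The token ids in the example batch are hypothetical; 50256 is GPT-2's actual eos token id.

```python
PAD_ID = 50256  # GPT-2's eos token id, reused here as the pad token

def last_real_token_index(input_ids, pad_id=PAD_ID):
    """Return the index of the last non-pad token in a right-padded sequence.

    With pad == eos, everything from the first pad id onward is padding,
    so the position just before it holds the last real token, whose
    hidden state summarizes the whole left-to-right context.
    """
    for i, tok in enumerate(input_ids):
        if tok == pad_id:
            return i - 1  # token just before the first pad
    return len(input_ids) - 1  # no padding: use the final position

# Batch of two right-padded sequences (hypothetical token ids)
batch = [
    [15496, 995, 50256, 50256],  # 2-token sequence + 2 pads
    [40, 588, 428, 3807],        # full-length sequence, no pads
]
print([last_real_token_index(seq) for seq in batch])  # -> [1, 3]
```

Hugging Face's `GPT2ForSequenceClassification` does essentially this internally: it locates the last non-pad position per sequence and applies the classification head there, which is why reusing eos as the pad token works without adding a new token to the vocabulary.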

gmihaila avatar Jun 17 '25 12:06 gmihaila