finetune-transformer-lm
Is the LM pre-trained with a _start_ symbol?
Hi,
I was wondering whether, during pre-training of the LM alone, the sentences were prepended with a start symbol, just as they are during fine-tuning. If so, could you please mention the name of that token in the learned vocab? If not, wouldn't prepending it introduce a bit of a mismatch with respect to the pre-trained LM? Of course, the model is being fine-tuned, so it will adapt, but then why would the start symbol be necessary for fine-tuning in the first place? Thanks!
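
For reference, here is a toy sketch of the kind of input construction I mean (the token names `_start_` / `_classify_` and the plain-dict encoder are just illustrative stand-ins, not the repo's actual `text_utils` API):

```python
# Toy illustration: fine-tuning extends the vocab with special symbols
# that the pre-trained LM may never have seen during pre-training.
encoder = {'hello': 0, 'world': 1}       # stand-in for the learned BPE vocab

encoder['_start_'] = len(encoder)        # prepended to every fine-tuning example
encoder['_classify_'] = len(encoder)     # appended; its hidden state feeds the task head

def build_example(tokens):
    # Each example is wrapped as [_start_] tokens [_classify_],
    # whereas pre-training (if no start symbol was used) only saw raw token streams.
    return [encoder['_start_']] + [encoder[t] for t in tokens] + [encoder['_classify_']]

print(build_example(['hello', 'world']))  # -> [2, 0, 1, 3]
```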