finetune-transformer-lm
Is the LM pre-trained with a _start_ symbol?
Hi,
I was wondering whether, during pre-training of the LM alone, the sentences were prepended with a start symbol, just as they are during fine-tuning. If so, could you please mention the name of that token in the learned vocab? If not, wouldn't prepending it introduce a bit of a mismatch with respect to the pre-trained LM? Of course, the model is being fine-tuned, so it will adapt, but then why would the start symbol be necessary for fine-tuning in the first place? Thanks!
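
For reference, here is a toy sketch of the kind of input construction I mean (the token names `_start_` / `_classify_` and the plain-dict encoder are just illustrative stand-ins, not the repo's actual `text_utils` API):

```python
# Toy illustration: fine-tuning extends the vocab with special symbols
# that the pre-trained LM may never have seen during pre-training.
encoder = {'hello': 0, 'world': 1}       # stand-in for the learned BPE vocab

encoder['_start_'] = len(encoder)        # prepended to every fine-tuning example
encoder['_classify_'] = len(encoder)     # appended; its hidden state feeds the task head

def build_example(tokens):
    # Each example is wrapped as [_start_] tokens [_classify_],
    # whereas pre-training (if no start symbol was used) only saw raw token streams.
    return [encoder['_start_']] + [encoder[t] for t in tokens] + [encoder['_classify_']]

print(build_example(['hello', 'world']))  # -> [2, 0, 1, 3]
```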