Carson Lam

Results: 5 comments by Carson Lam

So I did some research on my own, and basically my first two questions can be answered by looking at the Hugging Face Transformers repository: https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py

@danjohnvelasco As long as you use the same name `self.lm_head`, then when you load the pretrained model from the dictionary of parameters, these linear parameters will be replaced with the trained...
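
A minimal sketch of the name-matching behavior being described; the tiny module and the hand-built "checkpoint" dict below are stand-ins for illustration, not the model from this thread. `load_state_dict` pairs parameters by attribute name, so a head registered as `self.lm_head` receives the checkpoint's `lm_head.*` weights:

```
import torch
from torch import nn

class TinyLM(nn.Module):
    def __init__(self, hidden=8, vocab=10):
        super().__init__()
        # The attribute name determines the state-dict key: "lm_head.weight"
        self.lm_head = nn.Linear(hidden, vocab, bias=False)

# Pretend this dict came from a pretrained checkpoint.
pretrained = {"lm_head.weight": torch.ones(10, 8)}

model = TinyLM()
model.load_state_dict(pretrained)
# The randomly initialized lm_head weights have been replaced.
assert torch.equal(model.lm_head.weight, torch.ones(10, 8))
```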

Hi @immortal3, I love the minimal implementation. I'm having trouble reproducing the 25% speedup, though. I've been using `time` to compare the two implementations and the 125M model for generating...
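
A rough sketch of the kind of wall-clock comparison being described, using the stock Hugging Face GPT-2 (~124M parameters) as a stand-in for the implementations in this thread; the prompt and token count are placeholders, and a fairer benchmark would warm up first and average over several runs:

```
import time
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")

# Time one generation pass; repeat with the other implementation to compare.
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=50)
print(f"generation took {time.perf_counter() - start:.2f}s")
```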

Wow, what great response time! Here are the cells. Cell 0:

```
import numpy as np
import torch
from transformers import GPT2Tokenizer
from transformers import TrainingArguments
from accelerate import Accelerator
from...
```

Thanks @muellerzr, I did as you said, and in every cell above

```
args = (model, tokenizer, config)
notebook_launcher(training_loop, args, num_processes=2)
```

I have verified that each has `torch.cuda.is_initialized()`...
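
For context, a minimal self-contained sketch of the `notebook_launcher` pattern being debugged here; `training_loop` and its argument are placeholders, not the actual training code from this thread. The relevant pitfall is that `notebook_launcher` refuses to start if CUDA is already initialized in the notebook process, so anything CUDA-related has to happen inside the launched function:

```
import torch
from torch import nn
from accelerate import Accelerator, notebook_launcher

def training_loop(hidden_size):
    # One Accelerator per spawned process; it picks the right device.
    accelerator = Accelerator()
    model = nn.Linear(hidden_size, hidden_size)
    model = accelerator.prepare(model)  # moves model to this process's device
    accelerator.print(f"running on {accelerator.device}")

# This must hold *before* launching, otherwise the spawned
# processes cannot initialize CUDA themselves.
assert not torch.cuda.is_initialized()

notebook_launcher(training_loop, args=(16,), num_processes=2)
```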