Resizing model embedding when loading the model
In `create_hf_model`, what is the purpose of resizing the model embedding?
```python
model.config.end_token_id = tokenizer.eos_token_id
model.config.pad_token_id = model.config.eos_token_id
model.resize_token_embeddings(int(
    8 *
    math.ceil(len(tokenizer) / 8.0)))  # make the vocab size multiple of 8
```
@yaozhewei - please respond to this question.
Looking forward to any replies.
This resizing not only triggers the warning:

> You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 32008. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc

but also increases memory consumption when running ZeRO stage 2/3.
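If the goal is just to keep the vocab size a multiple of 8, one way to silence the warning is to let `resize_token_embeddings` do the padding itself. A minimal sketch, assuming a transformers release recent enough to support the `pad_to_multiple_of` argument:

```python
# Pass the exact tokenizer length and let transformers round the
# embedding size up to a multiple of 8, instead of computing the
# padded size by hand with math.ceil as above.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
```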
We usually resize the model embeddings when adding special tokens to the tokenizer, so the embedding matrix has rows for the new token IDs.
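For context, a minimal sketch of that typical flow (the checkpoint name and pad token here are just illustrative placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint only; any causal LM works the same way.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

# Adding special tokens grows the tokenizer's vocabulary...
num_added = tokenizer.add_special_tokens({"pad_token": "[PAD]"})

# ...so the embedding matrix must grow to match, or the new token
# IDs would index past the end of the embedding table.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```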