
Resizing model embedding when loading the model

puyuanOT opened this issue 2 years ago · 3 comments

In create_hf_model, what's the purpose of resizing the model embedding?

```python
model.config.end_token_id = tokenizer.eos_token_id
model.config.pad_token_id = model.config.eos_token_id
model.resize_token_embeddings(int(
    8 *
    math.ceil(len(tokenizer) / 8.0)))  # make the vocab size multiple of 8
```

puyuanOT · Sep 11, 2023
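For illustration, the rounding in the snippet above pads the vocabulary up to the next multiple of 8, which is the alignment the Tensor Core guide linked in the warning below asks for. A minimal sketch of the arithmetic; the vocabulary size of 32,001 is an assumption, chosen only to match the 32008 mentioned in the warning:

```python
import math

vocab_size = 32001  # hypothetical: e.g. a 32,000-token vocab plus one added token
padded = int(8 * math.ceil(vocab_size / 8.0))
print(padded)  # 32008 -- the next multiple of 8, matching the warning below
```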

@yaozhewei - please respond to this question.

awan-10 · Sep 12, 2023

Looking forward to any replies.

This resizing not only triggers the warning "You are resizing the embedding layer without providing a pad_to_multiple_of parameter. This means that the new embedding dimension will be 32008. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc", but also increases memory consumption when using ZeRO stage 2/3.

puyuanOT · Sep 12, 2023
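The warning quoted above also hints at an alternative: recent transformers releases accept a pad_to_multiple_of argument in resize_token_embeddings, which does the same rounding internally without emitting the warning. A minimal sketch, assuming transformers >= 4.28; the checkpoint name is only a placeholder, not taken from the original code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # placeholder checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

model.config.end_token_id = tokenizer.eos_token_id
model.config.pad_token_id = model.config.eos_token_id

# Pad the embedding matrix to a multiple of 8 in one call; this keeps the
# Tensor Core friendly alignment and avoids the pad_to_multiple_of warning.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
```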

We usually resize the model embeddings when adding special tokens to the tokenizer.

SupercarryNg · Nov 7, 2023
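To make that concrete, here is a minimal sketch of the usual add-special-tokens-then-resize pattern using generic Hugging Face calls; the checkpoint name and the [PAD] token are placeholders, not taken from the DeepSpeedExamples code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # placeholder checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Adding a special token enlarges the tokenizer, so the embedding matrix must
# be resized to match; otherwise the new token id would index out of range.
num_added = tokenizer.add_special_tokens({"pad_token": "[PAD]"})
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```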