DeepSpeedExamples
[PROBLEM] DeepSpeed-Chat create_hf_model for LLaMA: Token ID Question
Hello, and thank you for your work.
I am stuck on one point and would appreciate your input. When we first set up the tokenizer, this is the special-token information for the OPT models:
OPT TOKEN ID:

```python
{'bos_token': '</s>',
 'eos_token': '</s>',
 'unk_token': '</s>',
 'pad_token': '</s>'}
```
Then, when the model is created in `create_hf_model`, the config is set as:

```python
model.config.eos_token_id = tokenizer.eos_token_id
model.config.pad_token_id = model.config.eos_token_id
```
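For reference, here is a minimal sketch of the surrounding logic as I understand it (simplified; `create_hf_model_sketch` is my own name, and the real DeepSpeed-Chat function takes more arguments):

```python
import math

from transformers import AutoConfig, AutoModelForCausalLM

def create_hf_model_sketch(model_name_or_path, tokenizer):
    # Load the base config and weights.
    model_config = AutoConfig.from_pretrained(model_name_or_path)
    model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                                 config=model_config)

    # The config's pad_token_id is overwritten with the eos id,
    # regardless of which pad token the tokenizer carries.
    model.config.eos_token_id = tokenizer.eos_token_id
    model.config.pad_token_id = model.config.eos_token_id

    # The embedding table is resized to cover any tokens added to the
    # tokenizer (e.g. a new [PAD]), rounded up to a multiple of 8.
    model.resize_token_embeddings(int(8 * math.ceil(len(tokenizer) / 8.0)))
    return model
```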
However, as far as I can tell, LLaMA models do not ship with a pad_token; we add one later, in the `load_hf_tokenizer` section.
In short:

LLAMA TOKEN ID BEFORE ADDING PAD TOKEN:

```python
{'bos_token': '<s>',
 'eos_token': '</s>',
 'unk_token': '<unk>'}
```

```python
...
tokenizer.add_special_tokens({"pad_token": "[PAD]"})
...
```
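The effect of that call can be seen directly with the transformers API (the checkpoint path below is a placeholder for your own LLaMA weights):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/llama")  # hypothetical path

print(tokenizer.special_tokens_map)  # stock LLaMA: no 'pad_token' entry
num_added = tokenizer.add_special_tokens({"pad_token": "[PAD]"})
print(num_added)                     # 1 -> the vocab grew, so the model's
                                     #      embedding table must be resized
print(tokenizer.pad_token, tokenizer.pad_token_id)  # '[PAD]' and a new id
                                                    # past the original vocab
```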
But when the LLaMA model is then created in `create_hf_model`, our pad_token change does not seem to be applied to the model config:

```python
model.config.eos_token_id = tokenizer.eos_token_id     # ----> </s>
model.config.pad_token_id = model.config.eos_token_id  # ----> </s>
```
Shouldn't it instead be set as:

```python
model.config.pad_token_id = tokenizer.pad_token_id     # ----> [PAD]
```
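To illustrate the mismatch I am seeing (the printed values are hypothetical examples for a LLaMA checkpoint):

```python
print(tokenizer.pad_token_id)     # e.g. 32000, the newly added [PAD]
print(model.config.pad_token_id)  # e.g. 2, the eos id set in create_hf_model

# Aligning them explicitly would look like:
model.config.pad_token_id = tokenizer.pad_token_id
```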
Can you explain whether a different approach is intended for the model config's token IDs versus the tokenizer's token IDs?
@awan-10
@syngokhan - there's no need to focus on the pad token; it can be anything. The pad embedding will never affect the output, since padded positions are excluded by the attention mask (and ignored by the loss).
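A toy demonstration of why this holds (plain PyTorch, not DeepSpeed-Chat code): once a position is masked out of attention, replacing its embedding changes nothing at the unmasked positions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, dim = 5, 8
x = torch.randn(seq_len, dim)

# Treat position 4 as padding: it is masked out of the attention weights.
mask = torch.tensor([1, 1, 1, 1, 0], dtype=torch.bool)

def attend(x, mask):
    # Single-head self-attention with the pad column masked to -inf,
    # so its softmax weight is exactly zero.
    scores = (x @ x.T) / dim ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ x

out_a = attend(x, mask)

# Replace the pad position's embedding with something wildly different.
x_b = x.clone()
x_b[4] = torch.randn(dim) * 100.0
out_b = attend(x_b, mask)

# Outputs at the non-pad positions are identical.
print(torch.allclose(out_a[:4], out_b[:4]))  # True
```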