quiet-star warning

warning

Open clarencerat opened this issue 9 months ago • 1 comment

Some weights of the model checkpoint at ezelikman/quietstar-8-ahead were not used when initializing MistralForCausalLM: ['end_embedding', 'start_embedding', 'talk_head.0.0.bias', 'talk_head.0.0.weight', 'talk_head.0.2.bias', 'talk_head.0.2.weight', 'talk_head.0.4.weight']

This IS expected if you are initializing MistralForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing MistralForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
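
The warning amounts to a set difference: parameter names stored in the checkpoint that the instantiated class has no slot for. A minimal sketch of that check in plain Python follows. The seven unused names are copied from the warning; the two shared keys (`model.embed_tokens.weight`, `lm_head.weight`) and both helper functions are illustrative assumptions, not read from the actual checkpoint or taken from the transformers API.

```python
def unused_checkpoint_keys(checkpoint_keys, model_keys):
    """Checkpoint keys with no matching parameter in the model, sorted."""
    model = set(model_keys)
    return sorted(k for k in checkpoint_keys if k not in model)


def missing_model_keys(checkpoint_keys, model_keys):
    """Model parameters absent from the checkpoint, sorted."""
    ckpt = set(checkpoint_keys)
    return sorted(k for k in model_keys if k not in ckpt)


# The first seven names come from the warning text; the last two are
# hypothetical stand-ins for weights that both sides share.
checkpoint_keys = [
    "end_embedding",
    "start_embedding",
    "talk_head.0.0.bias",
    "talk_head.0.0.weight",
    "talk_head.0.2.bias",
    "talk_head.0.2.weight",
    "talk_head.0.4.weight",
    "model.embed_tokens.weight",
    "lm_head.weight",
]

# Assumed (illustrative) parameter names of the class being instantiated.
model_keys = ["model.embed_tokens.weight", "lm_head.weight"]

unused = unused_checkpoint_keys(checkpoint_keys, model_keys)
missing = missing_model_keys(checkpoint_keys, model_keys)

for name in unused:
    print("unused:", name)
```

If the unused list is non-empty while nothing is missing, as here, the checkpoint carries extra modules that the chosen class does not define, so those weights are dropped at load time.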

May 17 '24 11:05 clarencerat
