FastChat

The shape of lm_head.weight is not compatible between the base weight and the delta weight

Open sewellxie opened this issue 1 year ago • 2 comments

When running apply_delta.py with weight version v1.1, I find that the shape of lm_head.weight is [32000, 5120] in the base weight but [32001, 5120] in the delta weight. However, the shape of model.embed_tokens.weight is the same in both weights.

Has anyone encountered this problem before?

sewellxie avatar Apr 20 '23 05:04 sewellxie
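A mismatch like the one reported above can be located before applying a delta by comparing parameter shapes by name. The following is a minimal sketch, not part of apply_delta.py: the helper name `find_shape_mismatches` is hypothetical, and it assumes the shapes have already been collected from each checkpoint, for example with `{k: tuple(v.shape) for k, v in state_dict.items()}`. The lm_head.weight shapes come from the report; the model.embed_tokens.weight shape is illustrative, since the report only says it matches.

```python
def find_shape_mismatches(base_shapes, delta_shapes):
    """Return {name: (base_shape, delta_shape)} for parameters present in
    both mappings whose shapes differ. Hypothetical helper for illustration."""
    mismatches = {}
    for name, base_shape in base_shapes.items():
        delta_shape = delta_shapes.get(name)
        if delta_shape is not None and tuple(delta_shape) != tuple(base_shape):
            mismatches[name] = (tuple(base_shape), tuple(delta_shape))
    return mismatches


# lm_head.weight shapes are taken from the report above.
# The embed_tokens shape is an illustrative placeholder: the report
# only states that it is identical in both checkpoints.
base_shapes = {
    "lm_head.weight": (32000, 5120),
    "model.embed_tokens.weight": (32000, 5120),
}
delta_shapes = {
    "lm_head.weight": (32001, 5120),
    "model.embed_tokens.weight": (32000, 5120),
}

mismatches = find_shape_mismatches(base_shapes, delta_shapes)
for name, (base_shape, delta_shape) in mismatches.items():
    print(f"{name}: base {base_shape} vs delta {delta_shape}")
```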

Use my LLaMA weights, Ejafa/llama_7B. I recovered the model successfully with them.

Ejafa avatar Apr 20 '23 18:04 Ejafa

> use my llama Ejafa/llama_7B , I recovered successfully

Thanks. I solved this problem and recovered the 13B model by concatenating a torch.zeros([1, 5120]) row to the end of lm_head.weight in the base weight. But I still don't know why the shapes of lm_head.weight and model.embed_tokens.weight are not consistent.

sewellxie avatar Apr 21 '23 02:04 sewellxie
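The zero-row workaround described above could look roughly like the sketch below. It is an illustration, not the code from apply_delta.py: the helper name `pad_rows_with_zeros` is hypothetical, the tensors are small stand-ins for the real [32000, 5120] and [32001, 5120] weights, and the last step assumes the delta is applied by element-wise addition.

```python
import torch


def pad_rows_with_zeros(weight: torch.Tensor, target_rows: int) -> torch.Tensor:
    """Append zero rows to a 2-D weight until it has target_rows rows.
    Hypothetical helper for illustration."""
    missing = target_rows - weight.shape[0]
    if missing < 0:
        raise ValueError("weight already has more rows than target_rows")
    if missing == 0:
        return weight
    padding = torch.zeros(
        (missing, weight.shape[1]), dtype=weight.dtype, device=weight.device
    )
    return torch.cat([weight, padding], dim=0)


# Small stand-in tensors; the shapes in this thread are
# [32000, 5120] (base) and [32001, 5120] (delta).
base_lm_head = torch.randn(6, 4)
delta_lm_head = torch.randn(7, 4)

padded = pad_rows_with_zeros(base_lm_head, delta_lm_head.shape[0])

# Assumption: the delta is applied by element-wise addition.
recovered = padded + delta_lm_head
print(tuple(padded.shape), tuple(recovered.shape))
```

Because the appended row is all zeros, the extra row of the recovered weight equals the corresponding row of the delta, and the original rows are combined exactly as they would be without padding.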