LLaMA-Pro
Should I freeze norm.weight?
The LLaMA base model has a norm.weight parameter.
Did you also freeze norm.weight during post-training?
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.norm.weight": "model-00004-of-00004.safetensors"
We freeze all the weights of the initial LLaMA model and only train the newly added blocks, so norm.weight is frozen as well.
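A minimal PyTorch-style sketch of that setup, assuming the expanded model loads with transformers and that the newly added decoder blocks can be identified by their layer indices; the NEW_LAYER_IDS set and the checkpoint name below are illustrative placeholders, not part of the released code.

```python
import torch
from transformers import AutoModelForCausalLM

# Example checkpoint name; adjust to the expanded model you are training.
model = AutoModelForCausalLM.from_pretrained(
    "TencentARC/LLaMA-Pro-8B",
    torch_dtype=torch.bfloat16,
)

# Hypothetical indices of the blocks added by depth expansion.
NEW_LAYER_IDS = {4, 9, 14, 19, 24, 29, 34, 39}

for name, param in model.named_parameters():
    # Freeze everything by default, including model.norm.weight
    # and the original embedding / lm_head weights.
    param.requires_grad = False
    # Unfreeze only parameters that belong to the newly added blocks.
    parts = name.split(".")
    if "layers" in parts:
        layer_id = int(parts[parts.index("layers") + 1])
        if layer_id in NEW_LAYER_IDS:
            param.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable tensors")
```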