LLaMA-Pro
Should I freeze norm.weight?
The LLaMA base model has a norm.weight parameter.
Did you also freeze norm.weight during post-training?
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.norm.weight": "model-00004-of-00004.safetensors"
We freeze all the weights of the initial LLaMA model and only train the newly added blocks, so norm.weight is frozen as well.
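A minimal PyTorch-style sketch of that setup, assuming the expanded model loads with transformers and that the newly added decoder blocks can be identified by their layer indices; the NEW_LAYER_IDS set and the checkpoint name below are illustrative placeholders, not part of the released code.

```python
import torch
from transformers import AutoModelForCausalLM

# Example checkpoint name; adjust to the expanded model you are training.
model = AutoModelForCausalLM.from_pretrained(
    "TencentARC/LLaMA-Pro-8B",
    torch_dtype=torch.bfloat16,
)

# Hypothetical indices of the blocks added by depth expansion.
NEW_LAYER_IDS = {4, 9, 14, 19, 24, 29, 34, 39}

for name, param in model.named_parameters():
    # Freeze everything by default, including model.norm.weight
    # and the original embedding / lm_head weights.
    param.requires_grad = False
    # Unfreeze only parameters that belong to the newly added blocks.
    parts = name.split(".")
    if "layers" in parts:
        layer_id = int(parts[parts.index("layers") + 1])
        if layer_id in NEW_LAYER_IDS:
            param.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable tensors")
```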