
LLaMA 3.2 3B: Tied weights missing

Open heffernankevin opened this issue 8 months ago • 0 comments

Describe the bug: The config for LLaMA 3.2 3B does not use tied weights; currently only the 1B config enables them (https://github.com/facebookresearch/fairseq2/blob/main/src/fairseq2/models/llama/_config.py#L257).

Describe how to reproduce: Loaded LLaMA 3.2 3B and confirmed that the embedding and final projection weights in the checkpoint are identical, i.e. they are meant to be tied:

In [13]: (model.decoder_frontend.embed.weight == model.final_proj.weight).all()
Out[13]: tensor(True)
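For context, here is a minimal PyTorch sketch of what weight tying means (illustrative only, not fairseq2's actual implementation; the module names `embed` and `final_proj` mirror the attributes used in the check above):

```python
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
vocab_size, model_dim = 32, 8

embed = nn.Embedding(vocab_size, model_dim)
final_proj = nn.Linear(model_dim, vocab_size, bias=False)

# Tie the weights: the output projection reuses the embedding's
# weight tensor, so the two parameters share the same storage.
final_proj.weight = embed.weight

# The check from the report now holds, and the tensors alias each other.
print((embed.weight == final_proj.weight).all())  # tensor(True)
print(embed.weight.data_ptr() == final_proj.weight.data_ptr())  # True
```

When tying is enabled in a model config, only one copy of this matrix is stored in the checkpoint, which is why the 3B config omitting it is a bug rather than a cosmetic difference.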

heffernankevin · Apr 03 '25 11:04