
Weights are shared across the MLP layers

Open 0seba opened this issue 4 months ago • 1 comments

See https://github.com/linkedin/Liger-Kernel/pull/269. I confirmed that the weights are shared for vicuna 7b. (Screenshot 2024-10-02 at 15 59 41)

~~Also, for a reason I couldn't find, not all layers have a `res_connection` linear layer.~~ (Screenshot 2024-10-02 at 16 02 41)

Finally, from the same screenshot, the `prefix_embeding_layer` has an unused parameter, `prefix_embeding_layer.embed_tokens.weight`.
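For anyone who wants to reproduce the weight-sharing check without screenshots, here is a minimal sketch. It groups parameter names by object identity, so any group with more than one name is a shared weight. The names `find_shared` and `FakeWeight`, and the `mlp.*` / `fixed_mlp.*` layer names, are made up for illustration and are not taken from the Hydra code. The list-multiplication pattern used to build the shared layers is only an assumption about how this kind of sharing commonly arises, not a confirmed diagnosis of this repo.

```python
from collections import defaultdict


def find_shared(named_items):
    """Return groups of names that point at the very same object."""
    groups = defaultdict(list)
    keep_alive = []  # hold references so ids cannot be reused while grouping
    for name, obj in named_items:
        keep_alive.append(obj)
        groups[id(obj)].append(name)
    # Only groups with more than one name indicate sharing.
    return [names for names in groups.values() if len(names) > 1]


class FakeWeight:
    """Stand-in for a tensor/parameter (hypothetical, keeps the demo stdlib-only)."""


# Assumed cause: list multiplication repeats ONE object three times.
shared_layers = [FakeWeight()] * 3
# A comprehension builds three independent objects instead.
independent_layers = [FakeWeight() for _ in range(3)]

named = [(f"mlp.{i}.weight", w) for i, w in enumerate(shared_layers)]
named += [(f"fixed_mlp.{i}.weight", w) for i, w in enumerate(independent_layers)]

shared_groups = find_shared(named)
print(shared_groups)
# [['mlp.0.weight', 'mlp.1.weight', 'mlp.2.weight']]
```

On a real PyTorch model the same function should work with `model.named_parameters(remove_duplicate=False)` as input, since the default `remove_duplicate=True` hides repeated parameters, which is exactly what we want to see here.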

0seba, Oct 02 '24 19:10