Easy-Transformer
[Bug Report] GatedMLP not in docs.
Describe the bug
We added gated MLPs when we added LLaMA support (https://github.com/neelnanda-io/TransformerLens/commit/3d03ca5081ff0b7a920ffe7830e2c3da0e6e9d07), but we didn't update the docs or add tests specifically for the GatedMLP component. It's on me for not catching it, but it would be really nice if someone could please:
- [ ] 1. Update the HookedTransformerConfig docstring to explain the gated MLP arg.
- [ ] 2. Add tests that verify the gated MLP works (it does, but we should have tests) and that, when run with the hooked activation cache, the cached activations are correct.
- [ ] 3. Optional: Add some visualization or a tutorial around interpreting gated MLP neurons/activations.
@0amp, if you have time, this might be easy for you since you have the most context. Thanks for this, btw; I almost made a card to add this, but you'd already done it!
Additional context
https://arxiv.org/pdf/2002.05202.pdf
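For anyone picking this up: the linked paper (Shazeer, "GLU Variants Improve Transformer") is where the gating comes from. Here is a minimal sketch of what a gated MLP computes, assuming the W_gate / W_in / W_out naming used by the GatedMLP component; the biases and exact ordering are my guess, not quoted from the implementation:

```python
import torch
import torch.nn.functional as F

def gated_mlp(x, W_gate, W_in, b_in, W_out, b_out, act=F.silu):
    # GLU-style MLP: the activation of the gate branch elementwise-scales the
    # linear branch before the output projection (SwiGLU when act is SiLU).
    gate = act(x @ W_gate)     # [batch, pos, d_mlp]
    linear = x @ W_in          # [batch, pos, d_mlp]
    return (gate * linear + b_in) @ W_out + b_out  # [batch, pos, d_model]
```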
Checklist
- [x] I have checked that there is no similar issue in the repo (required)
I can do this, but probably in a couple of weeks or so.
Hi @0amp, I'm not sure whether a one-liner for (2) is sufficient. Specifically, how is W_gate populated in the current unit test(s)? With that info, I could probably then do task (1) (despite not understanding the context paper referred to above).
Hey @danlaudk, I think this would require creating a new test that checks that an equivalent gated MLP implemented in plain PyTorch gets the same result, up to torch.allclose.
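To make that concrete, here's a rough sketch of the kind of test I have in mind. It's only a sketch: the import path, the config fields (gated_mlp, act_fn="silu"), and the parameter/bias names on the component are assumptions based on the LLaMA-support commit and may need adjusting to match the actual code:

```python
import torch
import torch.nn.functional as F

from transformer_lens import HookedTransformerConfig
from transformer_lens.components import GatedMLP  # assumed import path


def test_gated_mlp_matches_reference():
    cfg = HookedTransformerConfig(
        n_layers=1, d_model=32, n_ctx=16, d_head=8,
        d_mlp=64, act_fn="silu", gated_mlp=True,
    )
    mlp = GatedMLP(cfg)
    x = torch.randn(2, 5, cfg.d_model)

    # Reference: the same GLU-style computation written out in plain PyTorch,
    # using the component's own (randomly initialised) weights.
    expected = (
        F.silu(x @ mlp.W_gate) * (x @ mlp.W_in) + mlp.b_in
    ) @ mlp.W_out + mlp.b_out

    assert torch.allclose(mlp(x), expected, atol=1e-5)
```

A follow-up test could load a small model with gated_mlp=True and check the activations returned by run_with_cache against the same reference, but that depends on the exact hook names, so I've left it out of the sketch.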