
[Bug Report] GatedMLP not in docs.

Open · jbloomAus opened this issue 1 year ago · 3 comments

Describe the bug

We added gated MLPs when we added LLaMA support (https://github.com/neelnanda-io/TransformerLens/commit/3d03ca5081ff0b7a920ffe7830e2c3da0e6e9d07), but we didn't update the docs or add tests specifically for the GatedMLP component. That's on me for not catching it, but it would be really nice if someone could please:

  • [ ] 1. Update the HookedTransformer config docstring to explain the gated MLP arg.
  • [ ] 2. Add tests that verify the gated MLP works (it does, but we should have tests) and that, when run with the hooked activation cache, the cached activations are correct (see the sketch after this list).
  • [ ] 3. Optional: add some visualization or a tutorial around interpreting gated MLP neurons/activations.
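
A rough sketch of what the activation-cache check in item 2 could look like. The `gated_mlp` config flag, the `"silu"` act_fn, and the hook names below are assumptions inferred from this thread and the linked commit, not verified API; adjust them to match the actual component:

```python
import torch
from transformer_lens import HookedTransformer, HookedTransformerConfig

# Tiny model with gated MLPs (the `gated_mlp` arg name is assumed from this thread).
cfg = HookedTransformerConfig(
    n_layers=1, d_model=16, d_mlp=64, n_heads=2, d_head=8,
    n_ctx=32, d_vocab=100, act_fn="silu", gated_mlp=True,
)
model = HookedTransformer(cfg)

tokens = torch.randint(0, cfg.d_vocab, (1, 8))
_, cache = model.run_with_cache(tokens)

# Recompute the gate pre-activation from the normalized residual stream and
# compare it against the cached value (hook names are assumptions).
resid = cache["blocks.0.ln2.hook_normalized"]
expected_pre = resid @ model.blocks[0].mlp.W_gate
assert torch.allclose(cache["blocks.0.mlp.hook_pre"], expected_pre, atol=1e-5)
```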

@0amp If you have time, this might be easy for you, since you have the most context. Thanks for this, by the way; I almost made a card to add this, but you'd already done it!

Additional context

https://arxiv.org/pdf/2002.05202.pdf
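
For context, that paper (Shazeer, "GLU Variants Improve Transformer") defines the gated feed-forward layer as out = (act(x W_gate) ⊙ (x W_in)) W_out. A minimal PyTorch sketch of the idea (parameter names follow this thread's W_gate convention; the bias-free form and initialization are simplified assumptions, not the repo's actual implementation):

```python
import torch
import torch.nn as nn

class GatedMLPSketch(nn.Module):
    """Gated MLP / GLU variant: out = (act(x @ W_gate) * (x @ W_in)) @ W_out.

    With act = SiLU this is the SwiGLU layer used by LLaMA. Biases are
    omitted for simplicity; real implementations may include them.
    """

    def __init__(self, d_model: int, d_mlp: int):
        super().__init__()
        self.W_gate = nn.Parameter(torch.randn(d_model, d_mlp) * 0.02)
        self.W_in = nn.Parameter(torch.randn(d_model, d_mlp) * 0.02)
        self.W_out = nn.Parameter(torch.randn(d_mlp, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The activated gate path multiplies the linear path elementwise
        # before the output projection.
        return (nn.functional.silu(x @ self.W_gate) * (x @ self.W_in)) @ self.W_out
```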

Checklist

  • [x] I have checked that there is no similar issue in the repo (required)

jbloomAus · May 03 '23

I can do this, but probably in a couple of weeks or so.

0amp · May 07 '23

Hi @0amp, I'm doubtful that a one-liner for (2) is sufficient. Specifically, how is W_gate populated in the current unit test(s)? With that info, I could probably then do task (1) (despite not understanding the paper referenced above).

danlaudk · Nov 20 '23

Hey @danlaudk, I think this would require creating a new test that checks that an equivalent gated MLP implemented in plain PyTorch gets the same result, up to torch.allclose.
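
For example, something along these lines (a sketch only; the W_gate/W_in/W_out names come from this thread, and the SiLU, bias-free form is an assumption that may not match the actual module):

```python
import torch

def test_gated_mlp_matches_reference(gated_mlp: torch.nn.Module):
    """Check the GatedMLP component against a plain-PyTorch recompute.

    Assumes `gated_mlp` exposes W_gate, W_in, and W_out and applies SiLU to
    the gate path; both the names and the bias-free form are assumptions.
    """
    d_model = gated_mlp.W_gate.shape[0]
    x = torch.randn(2, 5, d_model)  # (batch, pos, d_model)

    # Reference: out = (silu(x @ W_gate) * (x @ W_in)) @ W_out
    reference = (
        torch.nn.functional.silu(x @ gated_mlp.W_gate) * (x @ gated_mlp.W_in)
    ) @ gated_mlp.W_out

    assert torch.allclose(gated_mlp(x), reference, atol=1e-5)
```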

0amp · Nov 21 '23