nix-apollo
I think I found at least one part of what is going wrong. Have a look at the attention scores of head 5.0 on this example: `from transformer_lens import...`
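The snippet above is truncated, so here is a minimal sketch of how one might pull up those scores with TransformerLens; the model size and prompt are placeholders rather than the ones from the thread, and "head 5.0" is read as layer 5, head 0:
```
from transformer_lens import HookedTransformer

# Model size and prompt are assumptions, not the thread's originals.
model = HookedTransformer.from_pretrained("pythia-70m")
tokens = model.to_tokens("The quick brown fox jumps over the lazy dog")
_, cache = model.run_with_cache(tokens)

layer, head = 5, 0
scores = cache[f"blocks.{layer}.attn.hook_attn_scores"][0, head]  # pre-softmax, [query_pos, key_pos]
pattern = cache[f"blocks.{layer}.attn.hook_pattern"][0, head]     # post-softmax, [query_pos, key_pos]
print(scores.min().item(), scores.max().item())
```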
An easy test to run: add a hook in the attention pattern calculation for Pythia that centers the attention scores for every query position (before the mask). As softmax is invariant to adding a constant to every score in a query row, this shouldn't change the model's output beyond floating-point error.
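A sketch of that test, with one caveat: TransformerLens's `hook_attn_scores` fires after the causal mask has been applied, so the hook below centers each query row over the non-masked keys only. That is still just a per-row shift, so the softmax output should be unchanged up to floating-point effects. Model name and prompt are again placeholders.
```
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("pythia-70m")  # model size is an assumption

def center_attn_scores(scores, hook):
    # scores: [batch, head_index, query_pos, key_pos]; masked entries are -inf
    # (or a very large negative number in older TransformerLens versions).
    valid = torch.isfinite(scores) & (scores > -1e4)  # heuristic for "not masked"
    safe = torch.where(valid, scores, torch.zeros_like(scores))
    row_mean = safe.sum(-1, keepdim=True) / valid.sum(-1, keepdim=True).clamp(min=1)
    return torch.where(valid, scores - row_mean, scores)

tokens = model.to_tokens("The quick brown fox jumps over the lazy dog")
baseline = model(tokens)
centered = model.run_with_hooks(
    tokens,
    fwd_hooks=[(lambda name: name.endswith("attn.hook_attn_scores"), center_attn_scores)],
)
print((baseline - centered).abs().max().item())  # should be ~0 if numerics are healthy
```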
Just read through the notebook. For what it's worth, I strongly dislike the design choice of having persistent state about which SAEs are turned on or off in forward passes...
To me `HookedSAETransformer.attach_sae` seems more analogous to `add_hook`. I like having this version of statefulness! Having an object that represents "GPT2 with this set of SAEs attached" seems useful to...
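For concreteness, a sketch of the analogy: `add_hook` leaves the model persistently modified until `reset_hooks()` is called, and `attach_sae` would do the same for a set of SAEs. The SAE half is left as comments because the notebook's `attach_sae` signature isn't shown in the thread; those names are assumptions.
```
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")

def ablate_head_0(z, hook):
    # z: [batch, pos, head_index, d_head]; zero-ablate head 0's output in this layer
    z[:, :, 0, :] = 0.0
    return z

# Stateful hooks: every forward pass from here on runs with the hook attached.
model.add_hook("blocks.5.attn.hook_z", ablate_head_0)
logits_hooked = model("Hello world")
model.reset_hooks()  # back to the clean model

# The analogous SAE workflow: an object that is "GPT-2 with these SAEs attached".
# (attach_sae follows the thread; the detach call and both signatures are assumptions.)
# sae_model = HookedSAETransformer.from_pretrained("gpt2")
# sae_model.attach_sae(sae)      # persists across forward passes, like add_hook
# logits_sae = sae_model("Hello world")
# sae_model.detach_saes()        # analogue of reset_hooks
```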