
Add tests + better docs to ActivationCache

Open neelnanda-io opened this issue 2 years ago • 2 comments

Add tests that the methods in the ActivationCache class work correctly.

Go through the documentation and clarify things that are unclear (this is hard for me to do, so even just having someone new to the library flag confusions is helpful!)

neelnanda-io, Dec 19 '22 11:12

Is this already done? There are acceptance tests, but they don't cover some methods, such as get_full_resid_decomposition or compute_head_results.
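One property a test for a decomposition method could check is that the components sum back to the residual stream they decompose. The sketch below is a toy in plain Python, not TransformerLens code: the helper name `check_decomposition_sums_to_residual` and the numbers are made up for illustration, and it assumes no layer norm scaling has been applied to the components.

```python
import math

def check_decomposition_sums_to_residual(components, residual, tol=1e-6):
    """Return True if the element-wise sum of all components matches the residual."""
    # zip(*components) groups the i-th element of every component together.
    total = [sum(vals) for vals in zip(*components)]
    return all(math.isclose(t, r, abs_tol=tol) for t, r in zip(total, residual))

# Hypothetical components (e.g. one head, one neuron, one bias term), 2 dims each.
components = [[0.5, -1.0], [0.25, 0.5], [0.25, 1.5]]
residual = [1.0, 1.0]

ok = check_decomposition_sums_to_residual(components, residual)

# Dropping a component should make the check fail.
broken = check_decomposition_sums_to_residual(components[:-1], residual)
```

A real test would build `components` from the cache method under test and `residual` from the cached residual stream at the same layer and position.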

Felhof, May 30 '23 19:05

Suggestion for better docs: The order in which things are concatenated together in the residual stack isn't always clear.

Example: https://github.com/neelnanda-io/TransformerLens/blob/829084a53836c5b8b388aa37a5ffce73b6371712/transformer_lens/ActivationCache.py#L1026-L1039

Specifically, "... decomposition of the residual stream into embed, pos_embed, each head result, each neuron result, and the accumulated biases" seems to imply that the stacked activations come in that order, while in reality the order is [*heads, *neurons, embed, pos_embed, biases].

I think this can lead to rather subtle bugs (granted, this is my first time using the library): I accidentally unpacked a neuron activation as a head activation and mistakenly thought that head was doing something really weird/interesting. The same applies to stack_neuron_results.
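One way to avoid this kind of mix-up is to look components up by label instead of by position. The sketch below is a toy in plain Python, not TransformerLens code: `stack` and `labels` are made-up stand-ins laid out in the [*heads, *neurons, embed, pos_embed, biases] order described above, and the label strings (`L0H0`, `L0N0`, and so on) and the `component` helper are assumptions for illustration.

```python
# Toy sizes (hypothetical): one layer with 2 heads and 3 neurons.
n_heads, n_neurons = 2, 3

# Labels in the order described above: heads, neurons, embed, pos_embed, bias.
labels = (
    [f"L0H{h}" for h in range(n_heads)]
    + [f"L0N{n}" for n in range(n_neurons)]
    + ["embed", "pos_embed", "bias"]
)

# One scalar per component keeps the example small; real entries are tensors.
stack = [0.1, 0.2, 1.0, 1.1, 1.2, 5.0, 6.0, 7.0]
assert len(stack) == len(labels)

def component(name):
    """Return the stack entry for a labelled component (ValueError if unknown)."""
    return stack[labels.index(name)]

head_1 = component("L0H1")    # second head
neuron_0 = component("L0N0")  # first neuron, not a head
embed = component("embed")

# Positional access under the wrong assumption (embed first) silently returns a head.
wrong_embed = stack[0]
```

If the library version in use can return labels alongside the stacked activations, the same lookup pattern applies to the real output.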

andylolu2, Feb 24 '24 21:02