Easy-Transformer Add tests + better docs to ActivationCache

Open neelnanda-io opened this issue 2 years ago • 2 comments

Add tests that the methods in the ActivationCache class work correctly.

Go through the documentation and clarify things that are unclear (this is hard for me to do, so even just having someone new to the library flag confusions is helpful!)

Dec 19 '22 11:12 neelnanda-io

Is this already done? There are acceptance tests, though they don't cover some functions such as get_full_resid_decomposition or compute_head_results.

May 30 '23 19:05 Felhof

Suggestion for better docs: The order in which things are concatenated together in the residual stack isn't always clear.

Example: https://github.com/neelnanda-io/TransformerLens/blob/829084a53836c5b8b388aa37a5ffce73b6371712/transformer_lens/ActivationCache.py#L1026-L1039

Specifically, "... decomposition of the residual stream into embed, pos_embed, each head result, each neuron result, and the accumulated biases" seemed to imply that the stacked activations come in that order, while in reality the order is [*heads, *neurons, embed, pos_embed, biases].

I think this leads to rather subtle bugs (granted, this is only my first time using the library): I accidentally unpacked a neuron activation as a head activation and mistakenly thought the head was doing something really weird/interesting. The above also applies to stack_neuron_results.
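
To make the ordering concern concrete, here is a minimal sketch of looking up stack components by label rather than by hand-computed position. It assumes the order reported above ([*heads, *neurons, embed, pos_embed, biases]). The helper names `build_component_labels` and `component_index` and the label strings ("L1H2", "L0N1", "bias") are hypothetical, chosen for illustration only, and are not taken from the library. If the installed version of get_full_resid_decomposition offers a way to return labels alongside the stack, those labels are probably the safer source.

```python
from typing import List


def build_component_labels(n_layers: int, n_heads: int, d_mlp: int) -> List[str]:
    """Hypothetical helper: one label per stack entry, in the order
    reported in this thread: [*heads, *neurons, embed, pos_embed, biases]."""
    # All heads first, layer by layer.
    labels = [f"L{layer}H{head}" for layer in range(n_layers) for head in range(n_heads)]
    # Then all neurons, layer by layer.
    labels += [f"L{layer}N{neuron}" for layer in range(n_layers) for neuron in range(d_mlp)]
    # The embedding terms and the accumulated biases come last.
    labels += ["embed", "pos_embed", "bias"]
    return labels


def component_index(labels: List[str], name: str) -> int:
    """Return the position of `name` in the stack, failing loudly if absent."""
    try:
        return labels.index(name)
    except ValueError:
        raise KeyError(f"no component labelled {name!r}") from None


# Toy sizes (an assumption, not a real model): 2 layers, 3 heads, 4 neurons per layer.
labels = build_component_labels(n_layers=2, n_heads=3, d_mlp=4)
head_idx = component_index(labels, "L1H2")    # last head of layer 1
neuron_idx = component_index(labels, "L0N1")  # second neuron of layer 0
```

Indexing the stack with `head_idx` or `neuron_idx` instead of a manually derived offset makes it much harder to read a neuron entry as a head entry, and a mistyped label raises an error instead of silently returning the wrong component.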

Feb 24 '24 21:02 andylolu2
