
[Bug Report] Attention masking is not used by model forward methods

Open jmsdao opened this issue 2 years ago • 1 comment

[image: screenshot attached to the original issue]

jmsdao · Dec 20 '23

IIUC, the `attention_mask` is overwritten in the code if you don't set the `start_at_layer` argument:

https://github.com/neelnanda-io/TransformerLens/blob/main/transformer_lens/HookedTransformer.py#L535-L546

This is also mentioned in the docstring: https://github.com/neelnanda-io/TransformerLens/blob/main/transformer_lens/HookedTransformer.py#L511-L513

The forward pass infers the padding attention mask from the tokens themselves. In your case there are no pad tokens, so the inferred attention mask has no effect (it is all ones), and the mask you pass in is discarded.
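To make that concrete, here's a minimal repro sketch (assuming a TransformerLens release from around when this issue was filed; later versions may handle the passed mask differently):

```python
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")
tokens = model.to_tokens("Hello world")  # short prompt, no pad tokens

# Try to mask out every position via the forward kwarg:
custom_mask = torch.zeros_like(tokens)
logits_with_mask = model(tokens, attention_mask=custom_mask)
logits_without = model(tokens)

# On affected versions both outputs match, because forward() discards the
# passed mask (when start_at_layer is None) and re-infers one from the
# tokens, roughly `tokens != tokenizer.pad_token_id`. With no pad tokens,
# that inferred mask is all ones.
print(torch.allclose(logits_with_mask, logits_without))  # True on affected versions
```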

uralik · Feb 09 '24