[Question] Generation not possible with hooks?

Open FergusFettes opened this issue 1 year ago • 1 comments

I'm trying to implement various methods from the Patchscopes paper, and some of them use token generation, e.g. to explain the meaning of a patched representation.

I tried this and it kind of works. If you run .generate() on a hooked model, the hook seems to apply during generation. The next difficulty is that after the first forward pass, the shape of target_activations is [x, 1, x]. I guess this is because the rest of the activations are cached, and the model only computes the new token on subsequent forward passes?

I was hoping the hooks might get cleared after the first pass of the generation, but that doesn't seem to be the case. Clearing them after that first pass might be a quick fix for making this work.
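
A rough sketch of that idea, written as a hook that patches only on its first call and then passes activations through unchanged. The helper name `make_one_shot_patch_hook` and the `state` dict are hypothetical, the activation is assumed to have shape [batch, seq_len, d_model], and NumPy arrays stand in for the model's tensors so the snippet runs without loading a model:

```python
import numpy as np


def make_one_shot_patch_hook(source_vector, position):
    """Build a hook that patches one position on the first forward pass only.

    The returned function follows the (activation, hook) calling convention;
    `activation` is assumed to have shape [batch, seq_len, d_model].
    """
    state = {"calls": 0, "patched": False}

    def patch_hook(activation, hook=None):
        state["calls"] += 1
        if state["patched"]:
            # Later passes (e.g. cached decoding with seq_len == 1): no-op.
            return activation
        if position >= activation.shape[1]:
            raise IndexError(
                f"position {position} out of range for seq_len {activation.shape[1]}"
            )
        # Overwrite the chosen position for every item in the batch.
        activation[:, position, :] = source_vector
        state["patched"] = True
        return activation

    return patch_hook, state


# Stand-in activations: a 5-token prompt pass, then two single-token passes.
d_model = 4
source = np.ones(d_model)
hook_fn, state = make_one_shot_patch_hook(source, position=2)

prompt_pass = hook_fn(np.zeros((1, 5, d_model)))
step_one = hook_fn(np.zeros((1, 1, d_model)))
step_two = hook_fn(np.zeros((1, 1, d_model)))

print(prompt_pass[0, 2])               # the patched position
print(step_one.sum(), step_two.sum())  # later passes are left untouched
```

With the real model, something like this could presumably be registered as a forward hook around the .generate() call; I haven't tested that wiring. If the key/value cache is in use, the patch applied on the prompt pass should still influence later tokens through the cached keys and values, and a fresh hook would be needed for each new .generate() call.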

FergusFettes avatar Jan 30 '24 14:01 FergusFettes

For future reference, I got multi-token generation working with nnsight and implemented patchscopes here: https://github.com/jcoombes/obvs/pull/22

FergusFettes avatar Feb 09 '24 15:02 FergusFettes