Ilia Kulikov


@theSage21, I guess right now the best thing is to use `TorchAgent` as a base class; check the seq2seq agent for an example.

IIUC, the `attention_mask` is overwritten in the code if you don't set the `start_at_layer` argument: https://github.com/neelnanda-io/TransformerLens/blob/main/transformer_lens/HookedTransformer.py#L535-L546. This is also mentioned in the docstring: https://github.com/neelnanda-io/TransformerLens/blob/main/transformer_lens/HookedTransformer.py#L511-L513, since it infers the attention mask from the pad tokens...
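For context, a minimal pure-Python sketch of what "inferring the pad attention mask" means in general (this is not TransformerLens code; `PAD_ID` and `infer_attention_mask` are hypothetical names for illustration): positions holding the pad token get mask 0, everything else gets 1. If the model infers this mask itself, any mask you passed in can be silently replaced, which is the behavior described above.

```python
PAD_ID = 0  # assumed pad token id, for illustration only

def infer_attention_mask(token_ids):
    """Return a per-position mask: 1 for real tokens, 0 for padding."""
    return [[0 if t == PAD_ID else 1 for t in seq] for seq in token_ids]

batch = [
    [5, 7, 9, PAD_ID, PAD_ID],
    [3, 4, PAD_ID, PAD_ID, PAD_ID],
]
mask = infer_attention_mask(batch)
# mask == [[1, 1, 1, 0, 0], [1, 1, 0, 0, 0]]
```

This is why supplying a custom mask only takes effect on code paths that skip the inference step.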

@elisonlau, please consider checking the official torchaudio backend for wav2vec2-based models; IIRC it supports checkpoints from fairseq well, as you can see in this tutorial: https://pytorch.org/audio/stable/tutorials/speech_recognition_pipeline_tutorial.html I...

Example: ![image](https://github.com/facebookresearch/fairseq2/assets/2152005/eda952f8-1f3b-459a-8428-60de012efd4a)