DecisionTransformerInterpretability
Reverse Logit Lens
https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens
https://colab.research.google.com/drive/1MjdfK2srcerLrAJDRaJQKO0sUiZ-hQtA?usp=sharing
pip install git+https://github.com/finetuneanon/transformers/@gpt-neo-localattention