attention-viz
Visualize value vectors too?
We could also plot the value vectors for each attention head, though this would definitely add to the computational load.
- [x] Could try scaling attention weights by value norms too
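
For reference, here is a minimal sketch of what scaling attention weights by value norms could look like for a single head. It is not the code from this repo: the function name `scale_by_value_norms`, the `renormalize` flag, and the input layout (`attn` as a queries-by-keys list of rows, `values` as one vector per key) are all assumptions made for illustration, and it uses plain Python lists rather than any particular tensor library.

```python
import math


def scale_by_value_norms(attn, values, renormalize=False):
    """Scale each attention weight by the L2 norm of the matching value vector.

    Hypothetical helper, assumed layout for one head:
      attn[i][j]  -- weight from query position i to key position j
      values[j]   -- value vector at key position j
    """
    # L2 norm of every value vector, one per key position.
    norms = [math.sqrt(sum(x * x for x in v)) for v in values]

    # Element-wise product: weight (i, j) times the norm of value j.
    scaled = [[w * n for w, n in zip(row, norms)] for row in attn]

    if renormalize:
        # Optionally rescale each row to sum to 1 so rows stay comparable
        # in a plot; rows that sum to zero are left unchanged.
        for i, row in enumerate(scaled):
            total = sum(row)
            if total > 0:
                scaled[i] = [w / total for w in row]
    return scaled


if __name__ == "__main__":
    attn = [[0.5, 0.5], [0.9, 0.1]]
    values = [[3.0, 4.0], [0.0, 0.0]]
    print(scale_by_value_norms(attn, values))
    print(scale_by_value_norms(attn, values, renormalize=True))
```

The extra cost here is one norm per value vector plus an element-wise multiply, which is much cheaper than plotting the full value vectors for every head.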