activation-patching topic

List activation-patching repositories

pyvene

509 stars, 43 forks

Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions