Aryaman Arora

Results 8 issues of Aryaman Arora

Tested on Mac OS X on the latest Chrome version on two machines. Right clicking an edge just brings up the normal right click menu for selecting text. Works on...

## Description Add saving/loading of trainable parameters in the model (e.g. classification heads) to `IntervenableModel.save()` and `IntervenableModel.load()`. Draft PR since some tests are failing, will finalise tomorrow. ## Testing Done...

### Suggestion / Feature Request https://github.com/AlignmentResearch/tuned-lens

enhancement

### Contact Details Come to my office on Gates 3rd floor anytime 👍 ### What happened? `IntervenableModel.save()` doesn't save trained model parameters. This is an issue when you are also...

bug

We should add support for training sparse autoencoders ([Bricken et al., 2023](https://transformer-circuits.pub/2023/monosemantic-features#setup-autoencoder-motivation), [Cunningham et al., 2023](https://arxiv.org/abs/2309.08600)). Could be cool as a way of obtaining a feature basis for interventions.

Commonly, we want to exhaustively train DAS on every layer and position (or e.g. every attention head in a layer) to find which ones are causally relevant for the model's...
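A rough sketch of what such an exhaustive sweep could look like, in plain Python. Everything here is hypothetical: `sweep_sites`, `top_sites`, and `dummy_train_and_eval` are illustrative names, not part of any existing library, and the dummy scorer stands in for an actual DAS training and evaluation run.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Site = Tuple[int, int]  # (layer index, token position)


def sweep_sites(
    num_layers: int,
    num_positions: int,
    train_and_eval: Callable[[int, int], float],
) -> Dict[Site, float]:
    """Run one training job per (layer, position) and collect a score for each."""
    scores: Dict[Site, float] = {}
    for layer, position in product(range(num_layers), range(num_positions)):
        scores[(layer, position)] = train_and_eval(layer, position)
    return scores


def top_sites(scores: Dict[Site, float], k: int = 3) -> List[Site]:
    """Return the k sites with the highest score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def dummy_train_and_eval(layer: int, position: int) -> float:
    # Placeholder score, NOT a real DAS run: it simply peaks at layer 2, position 1.
    return 1.0 / (1 + abs(layer - 2) + abs(position - 1))


scores = sweep_sites(num_layers=4, num_positions=3, train_and_eval=dummy_train_and_eval)
best = top_sites(scores, k=1)
```

In practice the callback would train an intervention at the given site and return something like held-out intervention accuracy; the grid could just as well range over attention heads in a layer.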

big model go brrrr

enhancement

From @Jemoka when trying to save/load a bert.

```
File ~/Documents/Projects/dropval/playground/dropval/trainers/reft.py:213, in ReFTrainer.load(self, path)
    210 del model.config.__dict__["use_cache"]
    211 model = model.train()
--> 213 self.model = pyreft.ReftModel.load(
    214     str(Path(path)/"intervention"),
    215     model...
```