repeng

A library for making RepE control vectors

Results 30 repeng issues

I successfully reproduced the notebook output for "mistralai/Mistral-7B-Instruct-v0.1". But when I change the model, I cannot get the desired result with the same settings. Am I missing something? Or the model...

```
with open("data/all_truncated_outputs.json") as f:
    output_suffixes = json.load(f)

truncated_output_suffixes = [
    tokenizer.convert_tokens_to_string(tokens[:i])
    for tokens in (tokenizer.tokenize(s) for s in output_suffixes)
    for i in range(1, len(tokens))
]
truncated_output_suffixes_512 = [
    tokenizer.convert_tokens_to_string(tokens[:i])...
```

I'm trying to implement control vectors in the vllm codebase for the Mixtral model, but I was wondering where I should add the control vector to the layer. Should it be added...

There's a large body of work on dimensionality reduction that handles nonlinearity better, e.g. UMAP: https://umap-learn.readthedocs.io/en/latest/ Is it simple to just "drop" this in place of PCA...

This is a fix in extract.py for #28. I also re-ran the experiments.ipynb notebook to see the outputs. As expected, using this method seems to create stronger vectors, such that...

In the original repeng paper they mostly described the unsupervised version of PCA, where they randomly paired the hidden state vectors and performed PCA on these pairs. This is fundamentally...
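
For readers unfamiliar with the pairing step this issue refers to, here is a minimal sketch of one reading of it: take the difference of each pair of hidden states in a random order, then run PCA on those differences. The synthetic "hidden states", the helper names `random_pair_differences` and `first_principal_direction`, and the use of power iteration in place of a PCA library are all assumptions made for illustration, not code from the library or the paper.

```python
import random

def first_principal_direction(rows, iters=100):
    """Unit-length first principal component of `rows`, via power iteration."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    v = [1.0 / dim ** 0.5] * dim  # arbitrary unit start vector
    for _ in range(iters):
        # One step of v <- X^T (X v), followed by normalization.
        scores = [sum(c[j] * v[j] for j in range(dim)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def random_pair_differences(states_a, states_b, rng):
    """Difference of each pair, with the order within the pair chosen at random."""
    diffs = []
    for a, b in zip(states_a, states_b):
        if rng.random() < 0.5:
            a, b = b, a
        diffs.append([x - y for x, y in zip(a, b)])
    return diffs

# Synthetic stand-ins for hidden states (assumption): each pair shares a
# random base vector and is shifted by +/- 1.5 along a known concept axis.
rng = random.Random(0)
concept = [1.0, 0.0, 0.0, 0.0]
states_pos, states_neg = [], []
for _ in range(200):
    base = [rng.gauss(0.0, 1.0) for _ in range(4)]
    states_pos.append([base[j] + 1.5 * concept[j] + rng.gauss(0.0, 0.3) for j in range(4)])
    states_neg.append([base[j] - 1.5 * concept[j] + rng.gauss(0.0, 0.3) for j in range(4)])

diffs = random_pair_differences(states_pos, states_neg, rng)
direction = first_principal_direction(diffs)

# The sign of a principal component is arbitrary, so compare absolute alignment.
alignment = abs(sum(d * c for d, c in zip(direction, concept)))
print(f"alignment with concept axis: {alignment:.3f}")
```

Because the order within each pair is random, the recovered direction has no inherent sign; deciding which end is "positive" needs a separate step, which this sketch leaves out.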

Perhaps a naive question, but rather than training a control vector on each run, how might I go about saving it for inference later?
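
One generic way to approach this, sketched under assumptions: if a trained control vector can be reduced to a model-type string plus a mapping from layer index to a list of floats, it can be written to disk once and reloaded later. The helpers `save_control_vector` and `load_control_vector`, the JSON layout, and the example values are hypothetical and are not the library's own API or file format.

```python
import json
import os
import tempfile

def save_control_vector(path, model_type, directions):
    """Write a {layer index: list of floats} mapping to a JSON file."""
    payload = {
        "model_type": model_type,
        # JSON object keys must be strings, so layer indices are stringified.
        "directions": {str(layer): list(vec) for layer, vec in directions.items()},
    }
    with open(path, "w") as f:
        json.dump(payload, f)

def load_control_vector(path):
    """Read the file back and restore integer layer indices."""
    with open(path) as f:
        payload = json.load(f)
    directions = {int(layer): vec for layer, vec in payload["directions"].items()}
    return payload["model_type"], directions

# Made-up example values (assumption): two layers, two dimensions each.
trained = {-5: [0.1, -0.2], -6: [0.3, 0.4]}

with tempfile.TemporaryDirectory() as tmp:
    vector_path = os.path.join(tmp, "control_vector.json")
    save_control_vector(vector_path, "mistral", trained)
    loaded_type, loaded = load_control_vector(vector_path)

print(loaded_type, sorted(loaded))
```

If the per-layer directions are held as arrays rather than lists, they would need converting to plain lists before saving and back to arrays after loading.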

In `make_dataset`, the positive and negative personas look like they are reversed.

I'm pretty new to interpretability libs, so this may be something obvious, but when I load a notebook (I've tried `experiments.ipynb` and `emotion.ipynb`) in a fresh Colab instance (whether CPU...