repeng issues

Cannot apply to other models

6

I successfully reproduced the notebook output for "mistralai/Mistral-7B-Instruct-v0.1". But when I change the model, I cannot get the desired result with the same settings. Am I missing something? Or the model...

Starlento

truncated_output_suffixes &

1

```
with open("data/all_truncated_outputs.json") as f:
    output_suffixes = json.load(f)
truncated_output_suffixes = [
    tokenizer.convert_tokens_to_string(tokens[:i])
    for tokens in (tokenizer.tokenize(s) for s in output_suffixes)
    for i in range(1, len(tokens))
]
truncated_output_suffixes_512 = [
    tokenizer.convert_tokens_to_string(tokens[:i])...
```

thistleknot
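
For readers skimming this issue, here is a minimal sketch of what the nested comprehension in the snippet computes: every proper token prefix of each suffix string. The whitespace-based `tokenize` and `convert_tokens_to_string` functions below are hypothetical stand-ins for the tokenizer methods used in the snippet, chosen only so the example runs without loading a model; a real subword tokenizer would split the text differently.

```python
def tokenize(text: str) -> list[str]:
    # Hypothetical stand-in for tokenizer.tokenize: split on whitespace.
    return text.split()


def convert_tokens_to_string(tokens: list[str]) -> str:
    # Hypothetical stand-in for tokenizer.convert_tokens_to_string.
    return " ".join(tokens)


def truncated_prefixes(suffixes: list[str]) -> list[str]:
    # range(1, len(tokens)) stops one short of the full length,
    # so the complete string itself is never emitted.
    return [
        convert_tokens_to_string(tokens[:i])
        for tokens in (tokenize(s) for s in suffixes)
        for i in range(1, len(tokens))
    ]


example = truncated_prefixes(["the cat sat", "hello"])
print(example)
```

Under these assumptions, a three-token string contributes two prefixes and a single-token string contributes none, because `range(1, 1)` is empty.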

vllm implementation

4

I'm trying to implement control vectors in the vLLM codebase for the Mixtral model, but I was wondering where I should add the control vector to the layer. Should it be added...

raywanb

Alternatives to PCA, such as umap

13

There's a large body of work on dimensionality reduction that handles nonlinearity better, e.g. UMAP: https://umap-learn.readthedocs.io/en/latest/ Is it simple to just "drop" this in place of PCA...

Hellisotherpeople

Will there be support for models with custom architecture (not only mistral or gpt based)?

7

Nishant-kirito

Changed supervised PCA to use center of contrasting vectors

4

This is a fix in extract.py for #28. I also re-ran the experiments.ipynb notebook to see the outputs. As expected, using this method seems to create stronger vectors, such that...

r3ndd

Computing the difference vectors for PCA

In the original repeng paper they mostly described the unsupervised version of PCA, where they randomly paired the hidden state vectors and performed PCA on these pairs. This is fundamentally...

r3ndd
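
The two variants discussed in this issue and in the pull request above it (PCA on pair differences versus PCA on vectors centered on each contrasting pair) can be sketched on synthetic data. This is an illustration only, not the code in extract.py: the function names, the synthetic hidden states, and the choice of SVD for PCA are all assumptions made for the example.

```python
import numpy as np


def first_principal_direction(x: np.ndarray) -> np.ndarray:
    """Standard PCA: subtract the column means, return the top unit direction."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


def direction_from_differences(pos: np.ndarray, neg: np.ndarray) -> np.ndarray:
    """PCA on the per-pair differences pos_i - neg_i."""
    return first_principal_direction(pos - neg)


def direction_from_pair_centers(pos: np.ndarray, neg: np.ndarray) -> np.ndarray:
    """PCA on both members of each pair after subtracting the pair midpoint."""
    midpoint = (pos + neg) / 2.0
    stacked = np.concatenate([pos - midpoint, neg - midpoint], axis=0)
    return first_principal_direction(stacked)


# Synthetic "hidden states": each pair shares a base vector and is pushed
# in opposite directions along one known axis, plus a little noise.
rng = np.random.default_rng(0)
dim, n_pairs = 16, 200
true_direction = np.zeros(dim)
true_direction[0] = 1.0
base = rng.normal(size=(n_pairs, dim))
pos = base + 2.0 * true_direction + 0.1 * rng.normal(size=(n_pairs, dim))
neg = base - 2.0 * true_direction + 0.1 * rng.normal(size=(n_pairs, dim))

centered_dir = direction_from_pair_centers(pos, neg)
# The sign of a principal direction is arbitrary, so compare absolute values.
alignment = abs(float(centered_dir @ true_direction))
print(alignment)
```

In this synthetic setup the offset between pair members is nearly constant, so mean-centering the raw differences removes most of it, while the midpoint-centered data is symmetric about the origin and keeps it. Whether real hidden states behave this way is an empirical question the sketch does not answer.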

question: how would you go about saving a control vector for later use

2

Perhaps a naive question, but rather than training a control vector with each run, how might I go about saving it for inference later?

vpicone
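
One generic way to approach this, sketched under the assumption that a trained control vector can be reduced to a mapping from layer index to a 1-D NumPy array. That representation and the helper names `save_directions` and `load_directions` are hypothetical; this is not repeng's own API, and any built-in export the library offers is not shown here.

```python
import os
import tempfile

import numpy as np


def save_directions(path: str, directions: dict[int, np.ndarray]) -> None:
    # np.savez needs string keys, so encode the layer index in the array name.
    np.savez(path, **{f"layer_{layer}": vec for layer, vec in directions.items()})


def load_directions(path: str) -> dict[int, np.ndarray]:
    # Reading each array inside the context manager loads it before the file closes.
    with np.load(path) as data:
        return {int(name.split("_", 1)[1]): data[name] for name in data.files}


# Toy per-layer vectors standing in for a trained control vector.
directions = {
    5: np.ones(4, dtype=np.float32),
    6: np.arange(4, dtype=np.float32),
}
path = os.path.join(tempfile.mkdtemp(), "control_vector.npz")
save_directions(path, directions)
restored = load_directions(path)
print(sorted(restored))
```

The saved file can then be loaded in a later session and the arrays handed back to whatever code applies them, so training does not have to be repeated on each run.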

Correct the positive and negative persona

1

In `make_dataset`, the positive and negative personas look like they are reversed.

hahuyhoang411

Numpy AttributeError on repeng import

4

I'm pretty new to interpretability libs, so this may be something obvious, but when I load a notebook (I've tried `experiments.ipynb` and `emotion.ipynb`) in a fresh Colab instance (whether CPU...

eggsyntax
