vllm [Feature]: Control vectors

Open generalsvr opened this issue 3 months ago • 6 comments

🚀 The feature, motivation and pitch

Add support for control vectors

See https://github.com/vgel/repeng and https://github.com/ggerganov/llama.cpp/pull/5970

Alternatives

No response

Additional context

No response

Mar 17 '24 01:03 generalsvr

@simon-mo @generalsvr I should be able to help with this. Let me know how to start.

For more context about control vectors: Representation Engineering: A Top-Down Approach to AI Transparency

Apr 13 '24 00:04 justinphan3110

We can achieve this by loading the control vectors when initializing the cache engine and applying the change in the forward() of the specified QKVLinear layers. However, such changes would have to be added for every model and every kind of linear method, which introduces extra complexity to the codebase. Do you have any hints on how we can abstract this logic and keep the integration clear? @simon-mo

Apr 15 '24 20:04 Kaiyang-Chen

Something else to consider is specifying different control vectors (and coefficients) per request, which would then be stacked into a control matrix with one dimension equal to the batch size.

This can be useful when serving users that require different styles of responses at the same time.

Not sure about the impact on latency.
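
To make the per-request idea concrete, here is a minimal NumPy sketch. It is not vLLM code: the function names `build_control_matrix` and `apply_control` are hypothetical, and it assumes a padded `(batch, seq_len, hidden)` layout, which may not match how a real serving engine lays out tokens.

```python
import numpy as np


def build_control_matrix(vectors, coefficients):
    """Stack one scaled control vector per request into a (batch, hidden) matrix.

    Hypothetical helper for illustration; not part of any vLLM API.
    """
    vectors = np.asarray(vectors, dtype=np.float32)            # (batch, hidden)
    coefficients = np.asarray(coefficients, dtype=np.float32)  # (batch,)
    if vectors.ndim != 2 or coefficients.shape != (vectors.shape[0],):
        raise ValueError("expected one vector and one coefficient per request")
    # Scale each request's vector by that request's coefficient.
    return vectors * coefficients[:, None]


def apply_control(hidden_states, control_matrix):
    """Add each request's control row to every token position of that request.

    Assumes hidden_states has shape (batch, seq_len, hidden).
    """
    return hidden_states + control_matrix[:, None, :]


# Two requests with different steering directions and strengths.
hidden = np.zeros((2, 3, 4), dtype=np.float32)
control = build_control_matrix([[1, 0, 0, 0], [0, 1, 0, 0]], [2.0, -1.0])
steered = apply_control(hidden, control)
```

A request that wants no steering can pass a coefficient of 0. In this sketch the extra work per steered layer is a single broadcast add, but the real latency impact would need to be measured in the engine itself.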

Apr 24 '24 22:04 sapountzis

Currently working on an implementation that wraps the decoder layer and changes the forward pass. Let me know if you want to collaborate on this.
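
For readers unfamiliar with the wrapping approach, a framework-agnostic sketch of the general idea follows. It is an assumption about the shape of such a wrapper, not the implementation mentioned above: `ControlledLayer` and `toy_layer` are hypothetical names, NumPy arrays stand in for real tensors, and the wrapped layer is assumed to return a single array rather than a tuple.

```python
import numpy as np


class ControlledLayer:
    """Wrap a callable layer and add a scaled control vector to its output.

    Hypothetical illustration; a real decoder layer may return extra values
    (for example a residual or cache) that would need to be passed through.
    """

    def __init__(self, layer, control_vector, coefficient=1.0):
        self.layer = layer
        self.control_vector = np.asarray(control_vector, dtype=np.float32)
        self.coefficient = coefficient

    def __call__(self, hidden_states, *args, **kwargs):
        # Run the original layer unchanged, then shift its output.
        output = self.layer(hidden_states, *args, **kwargs)
        return output + self.coefficient * self.control_vector


def toy_layer(hidden_states):
    """Stand-in for a decoder layer: just doubles its input."""
    return hidden_states * 2.0


wrapped = ControlledLayer(toy_layer, [1.0, 0.0, -1.0], coefficient=0.5)
result = wrapped(np.ones((2, 3), dtype=np.float32))
```

The appeal of wrapping is that the original layer's code stays untouched, so the same wrapper could in principle be reused across model architectures.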

Apr 25 '24 20:04 raywanb

@raywanb something worth looking into would also be the technique presented here, which might be superior in some regards:

https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction

It comes with a nice colab as well: https://colab.research.google.com/drive/1a-aQvKC9avdZpdyBn4jgRQFObTPy1JZw?usp=sharing&authuser=1

There's a discussion in the comments with the authors of the Representation Engineering paper.
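
As a rough illustration of how that technique differs from adding a control vector, the linked post describes ablating a direction, i.e. removing the component of each activation along it. The NumPy sketch below shows only that projection step; `ablate_direction` is a hypothetical name, and finding the direction itself is not covered here.

```python
import numpy as np


def ablate_direction(hidden_states, direction):
    """Remove the component of each hidden state along `direction`.

    Computes x - (x . r_hat) * r_hat for every row x, where r_hat is the
    unit-normalized direction. Hypothetical helper for illustration.
    """
    hidden_states = np.asarray(hidden_states, dtype=np.float64)
    direction = np.asarray(direction, dtype=np.float64)
    norm = np.linalg.norm(direction)
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = direction / norm
    # Scalar projection of every hidden state onto the unit direction.
    projection = hidden_states @ unit
    return hidden_states - projection[..., None] * unit


states = np.array([[3.0, 4.0], [1.0, 0.0]])
ablated = ablate_direction(states, [2.0, 0.0])
```

After ablation, every row is orthogonal to the chosen direction, whereas a control vector shifts activations along a direction by a fixed amount.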

Apr 28 '24 17:04 DreamGenX

It seems that the colab link doesn't work.

Apr 29 '24 16:04 heraclex12
