lorax
Add support for control vector adapters per request
Good thread on it here:
https://www.reddit.com/r/LocalLLaMA/comments/1bgej75/control_vectors_added_to_llamacpp/
Given how parameter-efficient control vectors are, they're a natural fit for something like LoRAX, where you might want to serve many different vectors on a single base model. A natural place to start could be a segmented gather matrix-vector addition kernel, since a control vector is simply added to the base model's hidden state.
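To make the idea concrete, here is a rough NumPy sketch of the per-request gather-and-add for a single layer. It is not LoRAX code: the function name `add_control_vectors`, the argument layout, the optional `scales`, and the convention that an adapter id of `-1` means "no control vector" are all assumptions for illustration. A real kernel would do the same thing on the GPU over contiguous segments of the batch.

```python
import numpy as np


def add_control_vectors(hidden, adapter_ids, control_vectors, scales=None):
    """Add a per-token control vector to the hidden state of one layer.

    hidden:          (num_tokens, hidden_dim) hidden states for the batch
    adapter_ids:     (num_tokens,) index of the control vector to use for
                     each token, or -1 for "no control vector" (assumed
                     convention, not taken from LoRAX)
    control_vectors: (num_adapters, hidden_dim) one vector per adapter
    scales:          optional (num_adapters,) strength per adapter
    """
    hidden = np.asarray(hidden, dtype=np.float32)
    adapter_ids = np.asarray(adapter_ids, dtype=np.int64)
    control_vectors = np.asarray(control_vectors, dtype=np.float32)

    out = hidden.copy()
    active = adapter_ids >= 0  # tokens that belong to a request with a vector
    if not active.any():
        return out

    ids = adapter_ids[active]
    delta = control_vectors[ids]  # gather: one row per active token
    if scales is not None:
        delta = delta * np.asarray(scales, dtype=np.float32)[ids, None]

    out[active] += delta  # purely additive, no matmul required
    return out


# Four tokens from three requests: two use adapter 0, one uses adapter 1,
# and one has no control vector attached.
hidden = np.zeros((4, 3), dtype=np.float32)
adapter_ids = [0, 1, -1, 0]
control_vectors = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
steered = add_control_vectors(hidden, adapter_ids, control_vectors)
```

Because the operation is only a gather plus an add, the cost per token is one vector read and one vector write per layer, which is why batching many different control vectors over one base model should be cheap.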
https://vgel.me/posts/representation-engineering/