Feature Request: length_penalty support

Open Qubitium opened this issue 2 years ago • 3 comments

We are trying to port our transformers-based generation code to exllama but could not find a configurable length_penalty control. Is this on the roadmap? Thanks.

Qubitium, Jun 13 '23 05:06

Could you elaborate? There are various more-or-less hacky ways to force shorter or longer replies from a language model, but no standard way of doing it. Is there a particular front-end or UI you're referring to?
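One of those hacky approaches, as a rough sketch: add a constant bias to the end-of-sequence logit before sampling. A positive bias makes stopping more likely at every step (shorter replies), a negative one makes it less likely (longer replies). The helper `bias_eos_logit`, the toy logits and the token id are all hypothetical and for illustration only; none of this is an existing exllama setting.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bias_eos_logit(logits, eos_token_id, bias):
    """Hypothetical helper: return a copy of logits with `bias` added to the EOS entry."""
    out = list(logits)
    out[eos_token_id] += bias
    return out

# Toy 3-token vocabulary; index 2 plays the role of EOS (assumption).
logits = [1.0, 0.5, 0.0]
EOS = 2

p_before = softmax(logits)[EOS]
p_shorter = softmax(bias_eos_logit(logits, EOS, 2.0))[EOS]   # EOS more likely
p_longer = softmax(bias_eos_logit(logits, EOS, -2.0))[EOS]   # EOS less likely
```

The bias only shifts the per-step stopping probability, so the effect on the final length is indirect and depends on the model and the sampler settings.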

turboderp, Jun 13 '23 12:06

After looking at the transformers length_penalty docs, it turns out to be a beam-search length normalization exponent (what I would call beam_alpha). So it only applies when generating with multiple beams.

https://github.com/huggingface/transformers/issues/16930
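For reference, a minimal sketch of how I understand the beam scoring: the summed log-probability of a finished beam is divided by its length raised to length_penalty. The function name and the toy numbers below are made up for illustration, and which length is counted (with or without the prompt) may differ between transformers versions.

```python
def length_normalized_score(sum_logprobs, length, length_penalty=1.0):
    """Score of a finished beam: summed log-probs divided by length ** length_penalty."""
    return sum_logprobs / (length ** length_penalty)

# Two hypothetical finished beams: (summed log-probability, length in tokens).
beams = {"short": (-4.0, 4), "long": (-6.0, 8)}

def pick(length_penalty):
    """Return the name of the beam with the highest normalized score."""
    return max(
        beams,
        key=lambda name: length_normalized_score(*beams[name], length_penalty),
    )

best_without_penalty = pick(0.0)  # raw sums compared: -4.0 vs -6.0
best_with_penalty = pick(1.0)     # per-token averages: -1.0 vs -0.75
```

Because log-probabilities are negative, a larger exponent favors the longer beam and a smaller (or negative) one favors the shorter beam. With a single sampled sequence there is nothing to rank, which is why the setting has no effect outside beam search.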

Qubitium, Jun 13 '23 13:06

Being able to bias generation towards shorter or longer responses would be a great addition.

vpassanisi, Jul 02 '23 22:07