exllama
Integrating with Guidance: adding a positive bias to certain tokens
Hi, thanks a lot for this project, really interesting!
I'm interested in trying to hook it up with the guidance library (https://github.com/microsoft/guidance). Before I attempt any coding, though, it would be helpful to get the maintainers' opinion, since I'm not very knowledgeable in this area.
One prerequisite is having a way to add a positive bias to certain tokens in the predictions.
See for instance these code snippets:
- https://github.com/microsoft/guidance/blob/e3c6fe93fa00cb86efc130bbce22aa29100936d4/guidance/llms/_transformers.py#L450
- https://github.com/microsoft/guidance/blob/e3c6fe93fa00cb86efc130bbce22aa29100936d4/guidance/library/_select.py#L96C1-L96C1
I've been looking at the generator code of ExLlama and found some similar logic in there, which adds a negative bias to constrained tokens, like in the gen_single_token method:
https://github.com/turboderp/exllama/blob/a01b25c884881871a0f75c96bbc582b6581665cb/generator.py#L344-L350
It seems that, to support guidance, we might need to add a new generation function that modifies this one slightly so that it also accepts a list of tokens that should receive a positive bias.
Am I understanding this correctly? Any thoughts? Appreciate any feedback, thanks!
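To make the idea concrete, here is a minimal pure-Python sketch of what I mean. It is not ExLlama's or guidance's actual code: `apply_token_bias`, `softmax`, and the `allowed`, `boosted`, and `boost` parameters are hypothetical names I made up for illustration, and a real implementation would presumably work on the logits tensor rather than on a Python list.

```python
import math


def apply_token_bias(logits, allowed=None, boosted=None, boost=5.0):
    """Return a biased copy of `logits` (hypothetical helper, for illustration).

    allowed: if given, every token id NOT in this list is set to -inf
             (the "negative bias" on constrained tokens).
    boosted: token ids that get `boost` added to their logit
             (the "positive bias" discussed in this issue).
    """
    biased = list(logits)
    if allowed is not None:
        allowed_set = set(allowed)
        biased = [x if i in allowed_set else float("-inf")
                  for i, x in enumerate(biased)]
    for token_id in boosted or ():
        biased[token_id] += boost
    return biased


def softmax(logits):
    """Numerically stable softmax; assumes at least one finite logit."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of four tokens; token 0 would win without any bias.
logits = [2.0, 1.0, 0.5, 0.0]

# Positive bias: push token 2 ahead of the rest.
biased = apply_token_bias(logits, boosted=[2], boost=3.0)
probs = softmax(biased)
best = max(range(len(probs)), key=probs.__getitem__)

# Negative bias: only tokens 1 and 3 remain selectable.
masked = apply_token_bias(logits, allowed=[1, 3])
masked_probs = softmax(masked)
```

The two kinds of bias compose: the mask removes tokens entirely, while the boost only shifts the relative odds among the tokens that are left.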
OK, so I got a bit carried away and did a simple implementation in this PR: https://github.com/turboderp/exllama/pull/104
I was waiting for it / planning to do it myself, but did not have enough time. So, thank you for your impatience :smile:.
@KaruroChori I guess this PR is just a PoC.
The full integration will be a bit harder on the guidance library side, so still plenty of work to do if you want 😀
I guess it will be a long weekend.
I opened an experimental PR on the guidance library: https://github.com/microsoft/guidance/pull/298
There are some things in it that I'm not quite happy with, but maybe someone can pick it up and improve it.