outlines
Add support for reasoning models
Expose two new keyword arguments for generation:
- end_thinking_tag: a string giving the tag the reasoning model emits when it has finished thinking. Constrained generation starts only after this tag.
- thinking_max_tokens: an int giving the maximum number of tokens the model may spend thinking. Once that number is reached, the end-of-thinking token is forced.
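
To make the interaction between the two keywords concrete, here is a minimal sketch of the per-sequence state they imply. It is not the code in this PR: `ThinkingState`, `next_action`, and `run` are hypothetical names, and the token ids are made up. It only shows when generation is left free, when the end-of-thinking token is forced, and when constraining begins.

```python
from dataclasses import dataclass


@dataclass
class ThinkingState:
    """Tracks the thinking phase of one sequence (hypothetical helper)."""

    end_thinking_token_id: int  # single token id matching end_thinking_tag
    thinking_max_tokens: int    # token budget for the thinking phase
    n_thinking_tokens: int = 0
    is_thinking: bool = True

    def update(self, token_id: int) -> None:
        """Record a token that was just generated."""
        if not self.is_thinking:
            return
        if token_id == self.end_thinking_token_id:
            self.is_thinking = False
        else:
            self.n_thinking_tokens += 1

    def next_action(self) -> str:
        """Decide how the next token should be sampled."""
        if not self.is_thinking:
            return "constrain"           # apply the structured-generation mask
        if self.n_thinking_tokens >= self.thinking_max_tokens:
            return "force_end_thinking"  # allow only the end-of-thinking token
        return "free"                    # leave the logits untouched


def run(proposed_token_ids, end_id, max_tokens):
    """Replay tokens a model would like to emit, overriding when the budget is spent."""
    state = ThinkingState(end_id, max_tokens)
    trace = []
    for proposed in proposed_token_ids:
        action = state.next_action()
        token = end_id if action == "force_end_thinking" else proposed
        state.update(token)
        trace.append((action, token))
    return trace
```

With a budget of 2 and an end-of-thinking id of 99, `run([5, 6, 7, 8, 9], 99, 2)` leaves the first two tokens free, forces 99 on the third step, and constrains the rest.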
Not supported:
- Models whose end-of-thinking tag does not correspond to a single token
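
One way to surface this limitation early is to check the tag against the tokenizer before generation starts. The sketch below is an assumption about how such a check could look, not the PR's code: `end_thinking_token_id` and `toy_encode` are hypothetical, and `encode` stands for any function that maps text to token ids without adding special tokens such as BOS.

```python
from typing import Callable, List


def end_thinking_token_id(tag: str, encode: Callable[[str], List[int]]) -> int:
    """Return the single token id for `tag`, or raise if it spans several tokens."""
    token_ids = encode(tag)
    if len(token_ids) != 1:
        raise ValueError(
            f"end_thinking_tag {tag!r} encodes to {len(token_ids)} tokens; "
            "only single-token tags are supported"
        )
    return token_ids[0]


# Toy stand-in for a real tokenizer: one known special token,
# everything else falls back to one id per character.
TOY_VOCAB = {"</think>": 7}


def toy_encode(text: str) -> List[int]:
    if text in TOY_VOCAB:
        return [TOY_VOCAB[text]]
    return [ord(c) for c in text]
```

Here `end_thinking_token_id("</think>", toy_encode)` returns the tag's id, while a tag the toy tokenizer splits into several ids raises a `ValueError`.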
In the future, once generation returns an object with several attributes instead of only the text output, we could capture the thinking content by adding a start_thinking_tag argument for the models that use one.
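
As a rough illustration of that idea, the sketch below splits a raw completion into thinking and answer parts. Everything here is hypothetical: `ReasoningOutput` and `split_thinking` do not exist in the library, and the real return object may look quite different.

```python
from typing import NamedTuple, Optional


class ReasoningOutput(NamedTuple):
    """Hypothetical result object with the thinking kept apart from the answer."""

    thinking: Optional[str]
    answer: str


def split_thinking(
    text: str,
    end_thinking_tag: str,
    start_thinking_tag: Optional[str] = None,
) -> ReasoningOutput:
    """Split `text` at the first end_thinking_tag, dropping the start tag if present."""
    before, sep, after = text.partition(end_thinking_tag)
    if not sep:
        # No end tag found: treat the whole text as the answer.
        return ReasoningOutput(None, text)
    before = before.lstrip()
    if start_thinking_tag is not None and before.startswith(start_thinking_tag):
        before = before[len(start_thinking_tag):]
    return ReasoningOutput(before.strip(), after.lstrip())
```

For models that emit no start tag, `start_thinking_tag` can stay `None` and everything before the end tag is treated as thinking.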