outlines
outlines copied to clipboard
Give the possibility to obtain the full response when calling the vLLM generate function
I'm using InspectAI to evaluate language models. In particular, I'm evaluating the benefits of structured text generation using Outlines with language models. I would like to obtain the full response when calling the vLLM generate function since InspectAI expects to get the full response. Would it be possible to give the possibility to the user to get the full response. The default should still be the same as now which is a filtered response.