
[v0][structured output] Support reasoning output

Open gaocegege opened this issue 9 months ago • 2 comments

Ref https://github.com/vllm-project/vllm/issues/12619

Apply the guided/structured grammar only after the reasoning steps in reasoning models. This enables reasoning output for the xgrammar and outlines backends; lm-format-enforcer is not supported.
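To illustrate the idea, here is a minimal, self-contained sketch of "constrain only after reasoning ends". It is not the code in this PR: the names `make_reasoning_aware_processor`, `toy_grammar`, and the token id `END_OF_THINK_ID` are hypothetical, and the grammar is a toy stand-in for a real backend such as xgrammar or outlines.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical token id standing in for the end-of-reasoning marker
# (for DeepSeek-R1 style models this would correspond to "</think>").
END_OF_THINK_ID = 7


def make_reasoning_aware_processor(
    allowed_tokens: Callable[[Sequence[int]], Set[int]],
    end_of_think_id: int,
) -> Callable[[Sequence[int], Sequence[float]], List[float]]:
    """Wrap a grammar so that it only constrains tokens after reasoning ends."""

    def processor(generated_ids: Sequence[int], logits: Sequence[float]) -> List[float]:
        ids = list(generated_ids)
        if end_of_think_id not in ids:
            # Still inside the reasoning section: leave the logits untouched.
            return list(logits)
        # Only the tokens after the first end-of-reasoning marker are
        # visible to the grammar.
        answer_ids = ids[ids.index(end_of_think_id) + 1:]
        allowed = allowed_tokens(answer_ids)
        return [
            logit if token_id in allowed else float("-inf")
            for token_id, logit in enumerate(logits)
        ]

    return processor


def toy_grammar(answer_ids: Sequence[int]) -> Set[int]:
    """Toy grammar: the answer must start with token 1 or 2, then only token 3."""
    return {1, 2} if len(answer_ids) == 0 else {3}


processor = make_reasoning_aware_processor(toy_grammar, END_OF_THINK_ID)
```

While the model is still reasoning, `processor` returns the logits unchanged; once the marker has been generated, every token the toy grammar disallows is masked to negative infinity.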

Usage:

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --enable-reasoning --reasoning-parser deepseek_r1

# `client` is assumed to be an OpenAI-compatible client pointed at the server above;
# `json_schema` is assumed to be a JSON schema with brand, model, and car_type fields.
prompt = ("Generate a JSON with the brand, model and car_type of "
          "the most iconic car from the 90's, think in 100 tokens")
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=[{
        "role": "user",
        "content": prompt,
    }],
    # vLLM-specific option: constrain the final answer to the schema.
    extra_body={"guided_json": json_schema},
)
print("content: ", completion.choices[0].message.content)
print("reasoning_content: ", completion.choices[0].message.reasoning_content)
content: {

"brand": "Levels",
"model": "racing equation",
"car_type": "sedan"
}
reasoning_content:  
Okay, the user is asking me to generate a JSON with the brand, model, and car type of the most iconic 90s car. I need to make sure it's within 100 tokens. Let me start by recalling some of the most significant cars from the 90s.型号 race cars are a big category, so theathy and V8 are definitely in there.马路超车 automotive should be in as well.

I need to find the brand.Levels was really influential, especially in racing. So the brand should be Levels. The model is the race car itself, maybe racing equations since they were a big trend. The engine details: a 7.5-litreUnused to make sense of the information, but I'll have to include it as it's important for the context.

Fact checking is crucial. The 90s included the Formula 1 series, which was growing. Yes, L charcoal was a key driver there. Values of engines and tests set a benchmark. B gjvertern is part of the马路 coder project, and P race appears to be another race series.

I should structure the JSON without parentheticals and make sure each field is clearly defined. Let me make a mental note to keep it concise. I'll write the JSON, making sure it neither uses more than 90 tokens nor exceeds that number. Finally, I'll double-check to ensure accuracy without making any mistakes.
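The usage snippet above leaves `client` and `json_schema` undefined. The sketch below shows one way they might look. It is an assumption, not part of the PR: the schema simply mirrors the three keys in the sample output, the `car_type` enum values are illustrative, and `build_request` and `check_content` are hypothetical helpers.

```python
import json

# Assumed schema: mirrors the three keys in the sample output above.
# The allowed car_type values are illustrative, not taken from the PR.
json_schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "model": {"type": "string"},
        "car_type": {
            "type": "string",
            "enum": ["sedan", "SUV", "Truck", "Coupe"],
        },
    },
    "required": ["brand", "model", "car_type"],
}


def build_request(prompt, schema,
                  model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"):
    """Build keyword arguments for client.chat.completions.create."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # vLLM-specific option, passed through the OpenAI client's extra_body.
        "extra_body": {"guided_json": schema},
    }


def check_content(content, schema):
    """Check that the final answer parses as JSON and has every required key."""
    data = json.loads(content)
    return all(key in data for key in schema["required"])
```

With the `openai` package installed, and assuming the server listens on vLLM's default address, the client could be created as `client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")` and the request sent with `client.chat.completions.create(**build_request(prompt, json_schema))`. `check_content(completion.choices[0].message.content, json_schema)` then gives a quick check that the answer was constrained, while `reasoning_content` stays free-form.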

gaocegege, Feb 08 '25 09:02

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock them, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

github-actions[bot], Feb 08 '25 09:02

@russellb Could you please help review this PR?

gaocegege, Feb 19 '25 01:02

Please let me know if anyone is ready to review and merge this. Once confirmed, I will resolve the conflicts here. It’s time-consuming to address them daily.

gaocegege, Feb 23 '25 01:02

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @gaocegege.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify[bot], Feb 25 '25 14:02

Comments are addressed. @aarnphm PTAL

gaocegege, Feb 26 '25 01:02

Thanks for your review @aarnphm @mgoin

gaocegege, Mar 03 '25 02:03