model_server
[2025.4.1] Gpt-oss multi turn template + add to agentic demo
🛠 Summary
Scores 0.35 on the BFCL multi-turn unary category.
Changes to original chat template:
- introduced `reasoning=none` - this automatically adds an empty reasoning channel, which forces the model to skip the reasoning part. The reasoning effort placeholder gets "low", since "none" was not present during the gpt-oss training phase (see the first sketch after this list)
- removed the exception raised when chat history contains both `content` and `reasoning_content`. For some reason, during the multi-turn BFCL benchmark I saw the model generating both channels, which caused an exception during history rendering. Instead of an exception, `content` is rendered and `reasoning_content` omitted (no idea why, I just assumed content might be more insightful)
- replaced `thinking` with `reasoning_content` - for some reason OpenAI used the `thinking` field to render reasoning from history. Replaced it with `reasoning_content`, which is present in the requests received during the BFCL benchmark
- fixed an issue with parsing an empty `tool_calls: []` array when rendering chat history. In some cases gpt-oss generates reasoning/content but no tool calls; BFCL sends an empty array, and the chat template accessed index 0, assuming there is always some tool call. The new chat template ignores empty arrays now (second sketch below)
- removed `|tojson` from tool argument rendering. It introduced string escaping which causes escaping in the output (a chain reaction); the OpenAI Harmony format assumes no escaping. This affected both the function call output (results from MCP servers) and the function call arguments (input to MCP servers) in chat history (third sketch below)
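
A minimal Jinja sketch of the `reasoning=none` idea, not the actual template: the `reasoning_effort` variable name and the surrounding structure are assumptions, and only the Harmony special tokens are taken from the gpt-oss format:

```jinja
{#- "none" was not seen during gpt-oss training, so the system prompt's
    reasoning-effort placeholder falls back to "low". -#}
{%- set effort = "low" if reasoning_effort == "none" else reasoning_effort -%}
<|start|>system<|message|>Reasoning: {{ effort }}<|end|>
{#- For reasoning=none, pre-fill an empty analysis (reasoning) channel so the
    model is nudged to start generating directly in the final channel. -#}
{%- if add_generation_prompt -%}
{%- if reasoning_effort == "none" -%}
<|start|>assistant<|channel|>analysis<|message|><|end|>
{%- endif -%}
<|start|>assistant
{%- endif -%}
```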
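A sketch of the two history-rendering fixes (both-channels precedence and the empty `tool_calls` guard). Field names follow the OpenAI-style request schema mentioned above; the channel layout is illustrative:

```jinja
{%- for message in messages -%}
{%- if message.role == "assistant" -%}
{#- When a history entry carries both fields, render content and silently
    drop reasoning_content instead of raising an exception. -#}
{%- if message.content -%}
<|start|>assistant<|channel|>final<|message|>{{ message.content }}<|end|>
{%- elif message.reasoning_content -%}
<|start|>assistant<|channel|>analysis<|message|>{{ message.reasoning_content }}<|end|>
{%- endif -%}
{#- tool_calls may be [] (BFCL sends it that way): iterate instead of
    unconditionally reading tool_calls[0]; an empty list renders nothing. -#}
{%- for tool_call in message.tool_calls or [] -%}
<|start|>assistant to=functions.{{ tool_call.function.name }}<|channel|>commentary<|message|>{{ tool_call.function.arguments }}<|end|>
{%- endfor -%}
{%- endif -%}
{%- endfor -%}
```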
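And the `|tojson` removal, before and after (again a sketch; the same change applies to the line that renders tool outputs):

```jinja
{#- Before: |tojson JSON-escapes a string that is already plain text, and the
    escaping compounds each time the history is re-rendered. -#}
<|message|>{{ tool_call.function.arguments | tojson }}<|end|>
{#- After: render arguments (and tool outputs) verbatim, as Harmony expects. -#}
<|message|>{{ tool_call.function.arguments }}<|end|>
```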
Are you sure there are no functional downsides?
No functional downsides visible on the simple & multiple datasets.
I'm not a fan of creating a specific chat template for a particular BFCL test category.
I'm not a fan either. But take a look at the vLLM chat templates: https://github.com/vllm-project/vllm/tree/main/examples - Mistral, for example, has one specific to the parallel calling use case. However, this multi-turn chat template should work for all scenarios, therefore I think I will remove the original one from the PR and just rename the multi-turn template to the regular one. Are you ok with that? @mzegla