model_server

[2025.4.1] Gpt-oss multi turn template + add to agentic demo

Open dkalinowski opened this issue 2 weeks ago • 1 comment

🛠 Summary

Scores 0.35 on the BFCL multi-turn unary category.

Changes to original chat template:

  • Introduced `reasoning=none`: this automatically adds an empty reasoning channel, which forces the model to skip the reasoning part. The reasoning-effort placeholder is filled with "low", since "none" was not present during the gpt-oss training phase (a sketch of the mapping follows this list).
  • Removed the exception raised when a chat-history message contains both `content` and `reasoning_content`. During the multi-turn BFCL benchmark the model sometimes generated both channels, which broke history rendering. Instead of raising, the template now renders `content` and omits `reasoning_content`, on the assumption that `content` is the more useful of the two (see the history-rendering sketch below).
  • Replaced `thinking` with `reasoning_content`: the upstream OpenAI template reads reasoning from a `thinking` field in the history, but the requests received during the BFCL benchmark carry `reasoning_content`, so the template now reads that field instead.
  • Fixed parsing of an empty `tool_calls: []` array when rendering chat history. In some cases gpt-oss generates reasoning/content but no tool calls, BFCL then sends an empty array, and the old template accessed index 0 assuming a tool call was always present. The new template skips empty arrays.
  • Removed `|tojson` from tool-argument rendering. It introduced string escaping that compounds in the output (a chain reaction across turns), while the OpenAI Harmony format assumes no escaping. This affected both function-call output (results from MCP servers) and function-call arguments (input to MCP servers) in the chat history (see the tool-call sketch below).
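
To make the first change concrete, below is a minimal Jinja sketch of how `reasoning=none` could be mapped inside the template. The Harmony-style markers (`<|start|>`, `<|channel|>`, `<|message|>`, `<|end|>`) and the `reasoning_effort` variable name are illustrative assumptions, not a quote of the actual template in this PR.

```jinja
{#- Sketch only: map the unsupported "none" effort to "low" in the system prompt, -#}
{#- since "none" was not seen during gpt-oss training. -#}
{%- set effort = "low" if reasoning_effort == "none" else reasoning_effort -%}
Reasoning: {{ effort }}
{#- When reasoning is disabled, pre-fill an empty analysis (reasoning) channel so the -#}
{#- model skips straight to the final answer. -#}
{%- if reasoning_effort == "none" -%}
<|start|>assistant<|channel|>analysis<|message|><|end|>
{%- endif -%}
```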
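
The second and third changes affect how assistant turns from the chat history are rendered. A hedged sketch of that logic, again with illustrative marker and field names rather than the exact template text:

```jinja
{#- Sketch of assistant-history rendering; not the verbatim template. -#}
{%- for message in messages -%}
  {%- if message.role == "assistant" -%}
    {%- if message.content -%}
      {#- If both content and reasoning_content are present, prefer content and drop -#}
      {#- the reasoning instead of raising an exception as the old template did. -#}
<|start|>assistant<|channel|>final<|message|>{{ message.content }}<|end|>
    {%- elif message.reasoning_content -%}
      {#- Read reasoning_content (sent by BFCL) rather than the upstream "thinking" field. -#}
<|start|>assistant<|channel|>analysis<|message|>{{ message.reasoning_content }}<|end|>
    {%- endif -%}
  {%- endif -%}
{%- endfor -%}
```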
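
Finally, the last two changes concern tool calls in the history: guarding against an empty `tool_calls: []` array and rendering arguments without `|tojson`. A sketch under the same assumptions:

```jinja
{#- Sketch of tool-call rendering in history; marker names are assumed, not verbatim. -#}
{%- if message.tool_calls is defined and message.tool_calls | length > 0 -%}
  {#- The old template indexed tool_calls[0] unconditionally; an empty array now skips this block. -#}
  {%- for tool_call in message.tool_calls -%}
    {#- Emit arguments verbatim: "|tojson" would re-escape the JSON string on every turn, -#}
    {#- while the Harmony format expects unescaped JSON. -#}
<|start|>assistant<|channel|>commentary to=functions.{{ tool_call.function.name }}<|message|>{{ tool_call.function.arguments }}<|end|>
  {%- endfor -%}
{%- endif -%}
```

The same reasoning applies to tool results coming back from MCP servers: they are inserted into the history as-is rather than being re-serialized.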

dkalinowski avatar Dec 11 '25 10:12 dkalinowski

> Are you sure there are no functional downsides?

No functional downsides are visible in the simple and multiple datasets.

> I'm not a fan of creating a specific chat template for a particular BFCL test category.

I'm not a fan either. But take a look at the vLLM chat templates: https://github.com/vllm-project/vllm/tree/main/examples. The Mistral one, for example, has a template specific to the parallel-calling use case. However, this multi-turn chat template should work for all scenarios, so I think I will remove the original one from the PR and just rename the multi-turn template to the regular one. Are you ok with that? @mzegla

dkalinowski avatar Dec 11 '25 11:12 dkalinowski