fix: fallback to strict json_schema if planner response validation fails in json_mode

Open taxman20000 opened this issue 6 months ago • 6 comments

Improved fix:

This fix improves upon the reverted PR https://github.com/bytedance/deer-flow/pull/322 because the fix in that PR was incompatible with some LLMs. In this improved PR, the planner tries to generate a plan using json_mode first and then using a strict json_schema if the first try fails.

In other words, the code keeps the existing json_mode for the planner's first attempt at generating a Plan response. Only if that response does not comply with the required format, so that pydantic model validation raises an OutputParserException, does the planner try again with a strict json_schema, which forces the LLM response to comply with the pydantic Plan model. If the second attempt also raises an exception, the user is instructed to reduce the complexity of their query.
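The control flow described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: `PlanValidationError` stands in for the OutputParserException, the `Plan` dataclass with `title` and `steps` is a simplified stand-in for the pydantic Plan model, and `call_llm` is a hypothetical callable that returns the raw LLM text for a given method.

```python
import json
from dataclasses import dataclass
from typing import Callable


class PlanValidationError(Exception):
    """Hypothetical stand-in for the OutputParserException raised on invalid output."""


@dataclass
class Plan:
    # Hypothetical, simplified fields; the real Plan is a pydantic model.
    title: str
    steps: list


def parse_plan(raw: str) -> Plan:
    """Validate raw LLM output against the simplified Plan shape."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise TypeError("plan must be a JSON object")
        if not isinstance(data["title"], str) or not isinstance(data["steps"], list):
            raise TypeError("plan fields have the wrong types")
        return Plan(title=data["title"], steps=data["steps"])
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        raise PlanValidationError(str(exc)) from exc


def generate_plan(call_llm: Callable[[str], str]) -> Plan:
    """Try json_mode first; retry with strict json_schema if validation fails."""
    try:
        return parse_plan(call_llm("json_mode"))
    except PlanValidationError:
        pass  # first response did not match the Plan format, fall through

    try:
        return parse_plan(call_llm("json_schema"))
    except PlanValidationError as exc:
        # Both attempts failed: surface an actionable message to the user.
        raise RuntimeError(
            "Plan generation failed; please reduce the complexity of your query."
        ) from exc
```

The second LLM call happens only when the first response fails validation, so models that already work with json_mode incur no extra cost.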

Problem:

The planner node expects an LLM response in a specific json format, but the LLM response does not always comply with that format so pydantic raises a model validation error, as reported by users in these issues: https://github.com/bytedance/deer-flow/issues/151 https://github.com/bytedance/deer-flow/issues/189 https://github.com/bytedance/deer-flow/issues/191 and possibly here: https://github.com/bytedance/deer-flow/issues/99 https://github.com/bytedance/deer-flow/issues/217

taxman20000 avatar Jun 17 '25 22:06 taxman20000

Hi @WillemJiang I would appreciate it if you could review this improved fix and test the models that failed with https://github.com/bytedance/deer-flow/pull/322

taxman20000 avatar Jun 17 '25 22:06 taxman20000

@taxman20000 I think it would be easier to choose the json_schema mode in the configuration. That way the user has full control.

WillemJiang avatar Jun 18 '25 12:06 WillemJiang

@WillemJiang Thank you for your comment.

Would you like the configuration to be set in conf.yaml? If so, I would propose to have 3 settings for the method:

  • auto (default)
  • json_mode
  • strict json_schema

The auto setting would first try json_mode and, if that fails, then try strict json_schema (similar to the code currently implemented in this PR). The other two options would exclusively try json_mode or strict json_schema.

I think having auto be the default option for the method would be easiest for users because it will work in most cases. Users should not have to know what the difference between json_schema and json_mode is to successfully use deer-flow. Advanced users could set the method if they wish to do so.

I can update this PR if you agree with this proposal.
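As a rough illustration of the proposal, the three settings could map to an ordered list of methods to attempt. This is only a sketch: the setting key `PLANNER_OUTPUT_METHOD` and the function name are hypothetical, and the dict stands in for whatever is loaded from conf.yaml.

```python
from typing import List

# Ordered methods to attempt for each proposed setting value.
METHOD_ORDER = {
    "auto": ["json_mode", "json_schema"],  # try json_mode, then fall back
    "json_mode": ["json_mode"],            # json_mode only
    "json_schema": ["json_schema"],        # strict json_schema only
}


def methods_to_try(conf: dict) -> List[str]:
    """Return the methods to attempt, defaulting to 'auto' when unset."""
    # PLANNER_OUTPUT_METHOD is a hypothetical key, not an existing setting.
    setting = conf.get("PLANNER_OUTPUT_METHOD", "auto")
    if setting not in METHOD_ORDER:
        raise ValueError(
            f"Unknown method {setting!r}; expected one of {sorted(METHOD_ORDER)}"
        )
    return list(METHOD_ORDER[setting])
```

With this shape, users who never touch the setting get the fallback behavior, while advanced users can pin a single method.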

taxman20000 avatar Jun 18 '25 13:06 taxman20000

Has this PR not been merged yet? Is there any other solution?

xuanyinmu avatar Jun 23 '25 03:06 xuanyinmu

Hi @WillemJiang please let me know if you have any further directions on how you would like me to modify this PR to get the fix merged.

taxman20000 avatar Jun 25 '25 14:06 taxman20000

We can add an info log that suggests the user switch to json_schema mode if it works, which can save the user some token usage. We also need to update the documentation so that advanced users can configure it.
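One way the suggested info log might look, as a sketch only: the logger name, function name, and message wording are assumptions, and `attempt` is a hypothetical callable that returns True when plan generation succeeds with the given method.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger("planner_sketch")  # hypothetical logger name


def first_working_method(
    methods: Sequence[str], attempt: Callable[[str], bool]
) -> Optional[str]:
    """Return the first method for which attempt() succeeds, or None."""
    for index, method in enumerate(methods):
        if attempt(method):
            if index > 0:
                # A fallback method succeeded: hint that pinning it in the
                # configuration avoids the failed first call and its tokens.
                logger.info(
                    "Plan generated with %s after an earlier method failed; "
                    "consider setting the method to %s in the configuration.",
                    method,
                    method,
                )
            return method
    return None
```

The hint is logged only when a fallback was actually needed, so users whose model works with json_mode see nothing extra.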

Thanks for looking into it and working on this proposal.

WillemJiang avatar Jun 27 '25 01:06 WillemJiang