[Bug] `dspy.ReAct` doesn't format/use `dspy.History` `OutputField`s
What happened?
When using dspy.ReAct and passing a dspy.History as an InputField, none of the included OutputField values are used in the LLM call chain.
Steps to reproduce
from dspy import ReAct, History, Signature, InputField, OutputField

# assumes an LM has already been configured, e.g. via dspy.configure(lm=...)

class GenerateAnswer(Signature):
    question: str = InputField(desc="The question to answer")
    history: History = InputField(desc="The conversation history")
    answer: str = OutputField(desc="The answer to the question")

react = ReAct(GenerateAnswer, tools=[])
history = History(messages=[{"question": "What's the capital of Germany?", "answer": "The capital of Germany is Berlin"}])
react.forward(question="What is the capital of France?", history=history)
Checking the LLM call chain reveals that the user portion is correctly passed, but the assistant portion is not.
lm.history
[
{'role': 'system', 'content': '...'}, # omitted for brevity
{'role': 'user', 'content': "[[ ## question ## ]]\nWhat's the capital of Germany?"}, # InputField correctly passed
{'role': 'assistant', 'content': '[[ ## next_thought ## ]]\nNone\n\n[[ ## next_tool_name ## ]]\nNone\n\n[[ ## next_tool_args ## ]]\nNone'}, # OutputField not passed
{'role': 'user', 'content': """[[ ## question ## ]]\nWhat is the capital of France?\n\n[[ ## trajectory ## ]]\n\n\nRespond with the correspondi
ng output fields, starting with the field `[[ ## next_thought ## ]]`, then `[[ ## next_tool_name ## ]]` (must be formatted as a
valid Python Literal['finish']), then `[[ ## next_tool_args ## ]]` (must be formatted as a valid Python dict[str, Any]), and the
n ending with the marker for `[[ ## completed ## ]]`."""}
]
@nickthegroot Thanks for reporting the issue, I will take a look
@nickthegroot
This is not a bug, but a feature. For ReAct, the question is not complicated enough to require calling a tool before finishing the task.
In the first call, the model decides that next_tool_name is finish and that the question can be answered without additional information.
`[[ ## next_tool_name ## ]]` (must be formatted as a valid Python Literal['finish'])
The output field answer is indeed included in the second call, which answers the question directly without calling additional tools.
# The first call to determine the next tool
[{'prompt': None,
'messages': [{'role': 'system',
....
{'role': 'user',
'content': "[[ ## question ## ]]\nWhat's the capital of Germany?"},
{'role': 'assistant',
'content': '[[ ## next_thought ## ]]\nNone\n\n[[ ## next_tool_name ## ]]\nNone\n\n[[ ## next_tool_args ## ]]\nNone'},
{'role': 'user',
'content': "[[ ## question ## ]]\nWhat is the capital of France?\n\n[[ ## trajectory ## ]]\n\n\nRespond with the corresponding output fields, starting with the field `[[ ## next_thought ## ]]`, then `[[ ## next_tool_name ## ]]` (must be formatted as a valid Python Literal['finish']), then `[[ ## next_tool_args ## ]]` (must be formatted as a valid Python dict[str, Any]), and then ending with the marker for `[[ ## completed ## ]]`."}],
'outputs': ['[[ ## next_thought ## ]]\nI know that the capital of France is Paris. I can finalize my response now.\n\n[[ ## next_tool_name ## ]]\nfinish\n\n[[ ## next_tool_args ## ]]\n{}\n\n[[ ## completed ## ]]'],
...
},
# The second call, which answers the question once the `finish` tool has been selected.
{'prompt': None,
'messages': [{'role': 'system',
....
{'role': 'user',
'content': "[[ ## question ## ]]\nWhat's the capital of Germany?"},
{'role': 'assistant',
'content': '[[ ## reasoning ## ]]\nNone\n\n[[ ## answer ## ]]\nThe capital of Germany is Berlin'},
{'role': 'user',
'content': '[[ ## question ## ]]\nWhat is the capital of France?\n\n[[ ## trajectory ## ]]\n[[ ## thought_0 ## ]]\nI know that the capital of France is Paris. I can finalize my response now.\n\n[[ ## tool_name_0 ## ]]\nfinish\n\n[[ ## tool_args_0 ## ]]\n{}\n\n[[ ## observation_0 ## ]]\nCompleted.\n\nRespond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.'}],
'outputs': ['[[ ## reasoning ## ]]\nThe capital of France is widely known to be Paris, which is a major European city and a global center for art, fashion, and culture.\n\n[[ ## answer ## ]]\nThe capital of France is Paris.\n\n[[ ## completed ## ]]'],
...
}]
cc @chenmoneygithub to keep me honest.
@Hangzhi Your understanding of ReAct is correct, but I think the user is asking why answer doesn't appear in the conversation history even though it is provided through the history field.
@nickthegroot Sorry for the late reply, but this is actually expected. ReAct is essentially a multi-stage program: the first stage only determines which tool to call, so its output fields don't include answer. As a result, even though the history has an answer field, it isn't reflected in the tool-calling module's prompt. However, if you look at the history of the ChainOfThought module inside ReAct, it does contain the answer output in the conversation history.
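If you want to verify this yourself, you can dump the last two LM calls after running the repro. A minimal sketch, reusing GenerateAnswer and history from the repro above and assuming an LM is configured (dspy.inspect_history is the built-in pretty-printer; the raw records are also available on lm.history):

import dspy

react = dspy.ReAct(GenerateAnswer, tools=[])
react(question="What is the capital of France?", history=history)
# Print the last two LM calls: the first is the tool-selection step, whose signature
# has no `answer` output, and the second is the extraction step, whose prompt does
# include the `answer` from the History messages.
dspy.inspect_history(n=2)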
I find this to be confusing, even after reading the explanations provided. Am I incorrect or does this make it harder or sometimes impossible to ask follow-up questions to a ReAct agent cohesively? If I want to reference the LLM's answer in another question, it is left without context to call additional tools if needed.
I'm facing the same issue and thinking about how I can handle follow-up questions to my ReAct agent.
Sorry for the late reply, but this is actually expected. ReAct is essentially a multi-stage program: the first stage only determines which tool to call, so its output fields don't include answer. As a result, even though the history has an answer field, it isn't reflected in the tool-calling module's prompt. However, if you look at the history of the ChainOfThought module inside ReAct, it does contain the answer output in the conversation history.
It took me a few read-throughs to understand this message, so I'm adding this here for the benefit of others as well as my future self. This is not a bug, just behavior I didn't expect.
This code houses all of the pain I felt 😄
react_signature = (
    dspy.Signature({**signature.input_fields}, "\n".join(instr))
    .append("trajectory", dspy.InputField(), type_=str)
    .append("next_thought", dspy.OutputField(), type_=str)
    .append("next_tool_name", dspy.OutputField(), type_=Literal[tuple(tools.keys())])
    .append("next_tool_args", dspy.OutputField(), type_=dict[str, Any])
)
Note that the signature passed in does not have its output fields reflected in the tool-calling signature, as Chen mentioned (I overlooked this).
In exploring the code, I don't know how I would change this either. Adding all non-input fields to the output could work, but would likely cause issues in some existing DSPy programs out in the wild.
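Purely for illustration (a hypothetical edit, not a proposed patch), that idea would amount to copying the wrapped signature's output fields onto react_signature after the construction shown above:

# Hypothetical continuation of the snippet above: surface the wrapped signature's
# output fields (e.g. `answer`) on the tool-calling signature, so that History
# entries containing them would be rendered into this module's conversation too.
for name, field in signature.output_fields.items():
    react_signature = react_signature.append(name, dspy.OutputField(), type_=field.annotation)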
A likely better solution, which will live in my application code instead, is to provide an optional list of memories or contexts to the input side of my signatures and perform "context engineering" instead of relying on chat histories entirely. They're still chat histories, but more intentional, IMO.
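A rough sketch of what I mean, assuming the setup from the repro (the signature name and the prior_turns field are my own, not a DSPy concept):

import dspy

class GenerateAnswerWithContext(dspy.Signature):
    """Answer the question, taking earlier turns into account."""
    question: str = dspy.InputField(desc="The question to answer")
    prior_turns: list[str] = dspy.InputField(desc="Earlier question/answer pairs, oldest first")
    answer: str = dspy.OutputField(desc="The answer to the question")

react = dspy.ReAct(GenerateAnswerWithContext, tools=[])
prior_turns = ["Q: What's the capital of Germany? A: The capital of Germany is Berlin"]
result = react(question="What is the capital of France?", prior_turns=prior_turns)
# Because prior_turns is an input field, it is rendered into every sub-module's
# prompt, including the tool-selection step, so follow-ups keep their context.
prior_turns.append(f"Q: What is the capital of France? A: {result.answer}")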