Feature Request: Streaming Token Generation with Mid-Generation Tool Execution
Summary
Add streaming token generation that can pause mid-stream, execute a tool, append the result to the context, and then resume. This would support agentic tool-use patterns in which the model expects tool results inline.
Problem Statement
Current Behavior
In the current Tinker architecture, model generation is atomic:
# tinker_cookbook/rl/rollouts.py
ac_with_logprobs = await policy(ob, stop_condition) # Complete generation
step_result = await env.step(ac_with_logprobs.tokens) # Process AFTER generation
SamplingClient.sample_async() returns only the final, complete token sequence, not intermediate tokens.
The Issue
Models trained for tool use (e.g., GPT-OSS, function-calling models) expect a specific interaction pattern:
Model: <analysis>Let me check the file</analysis>
Model: <tool_call>{"command": "cat file.txt"}</tool_call>
System: [Tool result appended inline] file contents here...
Model: <analysis>I see the file contains...</analysis>
Model: <tool_call>{"command": "echo 'fixed' > file.txt"}</tool_call>
System: [Tool result appended inline]
Model: <final_answer>Done!</final_answer>
But with atomic generation, we get:
Model: <analysis>Let me check the file</analysis>
Model: <tool_call>{"command": "cat file.txt"}</tool_call>
Model: [HALLUCINATED] The file probably contains... <-- Model guesses without seeing result
Model: <tool_call>{"command": "echo 'fixed' > file.txt"}</tool_call>
Model: [HALLUCINATED] Command executed successfully
Model: <final_answer>Done!</final_answer>
The model hallucinates tool results because it doesn't receive actual feedback inline.
I may be missing something, but I believe you can achieve this by adding the tool-call end marker as a stop condition. You can call
sampling_client.sample(
    sampling_params=tinker.SamplingParams(..., stop=<list of stop strings> or <list of stop tokens>)
)
to set the stop condition to anything.
Or, for the specific code you linked to in the cookbook, you can add the token corresponding to the tool-call end to the list of stop_condition tokens.
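To make the stop-condition approach concrete, here is a minimal sketch of the resulting loop: sample until the tool-call end marker, run the tool, append the result, and sample again. It does not call the Tinker API. `generate` and `run_tool` are hypothetical callables you would supply (`generate` would wrap your sampling call with the stop list), the `<tool_result>` wrapper is an invented format, and the sketch assumes the returned text includes the stop string, which may not match how the real sampler behaves.

```python
from typing import Callable, List

# Assumed markers, taken from the example transcript in this issue; the real
# ones depend on the model's chat format.
TOOL_CALL_START = "<tool_call>"
TOOL_CALL_END = "</tool_call>"


def run_with_tools(
    generate: Callable[[str, List[str]], str],  # hypothetical: (context, stop strings) -> new text
    run_tool: Callable[[str], str],             # hypothetical: tool-call payload -> tool output
    prompt: str,
    max_turns: int = 8,
) -> str:
    """Alternate sampling and tool execution until the model stops without a tool call."""
    context = prompt
    for _ in range(max_turns):
        # Sampling halts at the tool-call end marker, so the model cannot
        # continue past the call and guess at its result.
        chunk = generate(context, [TOOL_CALL_END])
        context += chunk
        if not chunk.endswith(TOOL_CALL_END):
            break  # no tool requested: generation is finished

        # Take the text between the last start marker and the end marker.
        body = chunk[: -len(TOOL_CALL_END)]
        payload = body.rsplit(TOOL_CALL_START, 1)[-1]

        # Append the real tool output before resuming. The <tool_result>
        # wrapper is a placeholder format, not something the model defines.
        context += f"<tool_result>{run_tool(payload)}</tool_result>"
    return context
```

Each pass through the loop is one ordinary, atomic sampling call, so this needs no token-level streaming. The cost is one extra sampling request per tool call.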
Closing due to inactivity!