[Bug] tracer returns incorrect token_ids
When using verl + agent-lightning for multi-round TIR training, we found a serious bug in vLLM's return_token_id implementation. With streaming output + Tool Parser, some "control tokens" are returned with incorrect token_ids, which causes severe mismatches during training and can crash the model within a few steps. We have submitted a PR to vLLM to fix this issue.
https://github.com/vllm-project/vllm/pull/29074
I hope we can work together to advance this bug fix and merge it into vLLM as soon as possible.
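To illustrate the failure mode, here is a minimal, hypothetical sanity check (not code from vLLM or agent-lightning): decode the accumulated token_ids from the stream and compare against the accumulated text. A toy vocabulary stands in for the real tokenizer; when the tool parser emits a control token's text but drops its token_id, the check fails.

```python
# Toy vocabulary standing in for a real tokenizer (illustrative only).
TOY_VOCAB = {0: "<tool_call>", 1: "print", 2: "(", 3: "1", 4: ")", 5: "</tool_call>"}

def decode(token_ids):
    return "".join(TOY_VOCAB[t] for t in token_ids)

def stream_is_consistent(chunks):
    """Each chunk is (text_delta, token_ids_delta).

    Returns True iff the concatenated token_ids decode back to the
    concatenated streamed text -- the invariant training relies on.
    """
    text = "".join(text for text, _ in chunks)
    ids = [i for _, id_delta in chunks for i in id_delta]
    return decode(ids) == text

# Healthy stream: every text delta carries its matching token_ids.
good = [("<tool_call>", [0]), ("print", [1]), ("(", [2]),
        ("1", [3]), (")", [4]), ("</tool_call>", [5])]
assert stream_is_consistent(good)

# Buggy stream: the control token's text arrives but its token_id is dropped,
# so the trained-on token_ids no longer match the generated text.
bad = [("<tool_call>", []), ("print", [1]), ("(", [2]),
       ("1", [3]), (")", [4]), ("</tool_call>", [5])]
assert not stream_is_consistent(bad)
```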
Agent-lightning currently has a workaround that converts streaming requests into non-streaming mode via a LiteLLM proxy: #293
The LiteLLM telemetry also has some severe streaming-related bugs, so vLLM is not the only framework that needs fixes to make this work.