verl icon indicating copy to clipboard operation
verl copied to clipboard

qwen2.5-vl chat_template Does Not Support Tool Call

Open Claude-Liu opened this issue 1 month ago • 2 comments

It appears that the qwen2.5-vl chat_template currently does not support tool calls. Because of this, the following script from VERL cannot work as expected:

Script: https://github.com/volcengine/verl/blob/main/examples/sglang_multiturn/geo3k/run_qwen2.5-3b_geo3k_multiturn.sh

This script relies on tool-call capability during chat generation, but the qwen2.5-vl template does not include the required tool schema formatting or tool-call tokens, resulting in failure during execution.

Claude-Liu avatar Nov 10 '25 14:11 Claude-Liu

The config it loads overrides default chat template with tool call

huaiyizhao avatar Nov 11 '25 07:11 huaiyizhao

class AgentLoopWorkerBase: """Agent loop worker takes a batch of messages and run each message in an agent loop."""

def __init__(
    self,
    config: DictConfig,
    server_handles: list[ray.actor.ActorHandle],
    reward_router_address: str = None,
):
    """Initialize agent loop manager.

    Args:
        config (DictConfig): YAML config.
        server_handles (List[ray.actor.ActorHandle]): OpenAI compatible LLM server actor handles.
    """
    self.config = config

    # for recipe to change
    if not hasattr(self, "server_manager"):
        self.server_manager = AsyncLLMServerManager(config, server_handles)

    self.reward_router_address = reward_router_address

    model_path = config.actor_rollout_ref.model.path
    self.model_name = "/".join(model_path.split("/")[-2:])
    local_path = copy_to_local(config.actor_rollout_ref.model.path)
    self.tokenizer = hf_tokenizer(local_path, trust_remote_code=True)
    self.processor = hf_processor(local_path, trust_remote_code=True)

the agent loop uses the default chat template of the LLM. Could you please explain how the chat template is overrided. ps: the sglang multiturn rollout can not support the rollout of qwen2.5vl, so we use agent loop instead.

Thanks a lot!

Claude-Liu avatar Nov 11 '25 08:11 Claude-Liu

CONFIG_PATH="$PROJECT_DIR/examples/sglang_multiturn/config"


python3 -m verl.trainer.main_ppo \
    --config-path="$CONFIG_PATH" \
    --config-name='geo3k_multiturn_grpo' \

It loads the geo3k_multiturn_grpo config, where the template are overrided

huaiyizhao avatar Nov 13 '25 09:11 huaiyizhao

I see. It is not a good question. Thanks for your patience!

Claude-Liu avatar Nov 13 '25 11:11 Claude-Liu