
[Feature]: Support tool calls for DeepSeek.

Open tingjun-cs opened this issue 9 months ago • 29 comments

🚀 The feature, motivation and pitch

I saw from the official documentation (https://docs.vllm.ai/en/latest/features/tool_calling.html) that vLLM supports tool calls, but I can't seem to find a tool parser for DeepSeek-V3/R1. Does this mean that the DeepSeek models do not support tool calls?

From the DeepSeek official website, it seems that function call support has been implemented on the model side, although it may still be unstable. https://api-docs.deepseek.com/zh-cn/guides/function_calling

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

tingjun-cs avatar Mar 13 '25 09:03 tingjun-cs

DeepSeek R1 does not support tool calling. But QwQ-32B supports it. Ref https://github.com/vllm-project/vllm/issues/14490

gaocegege avatar Mar 13 '25 09:03 gaocegege

DeepSeek R1 does not support tool calling. But QwQ-32B supports it. Ref #14490

What about DeepSeek-V3? From the DeepSeek official website, it appears that DeepSeek-Chat (also known as DeepSeek-V3) already supports function calls, although it may still be unstable. https://api-docs.deepseek.com/zh-cn/guides/function_calling

tingjun-cs avatar Mar 13 '25 09:03 tingjun-cs

Yes, v3 supports it.

/cc @WangErXiao

gaocegege avatar Mar 13 '25 09:03 gaocegege

Yes, v3 supports it.

/cc @WangErXiao

So, what tool-call-parser should be used for DeepSeek-V3?

tingjun-cs avatar Mar 13 '25 09:03 tingjun-cs

I do not think we support it now.

gaocegege avatar Mar 13 '25 10:03 gaocegege

I do not think we support it now.

Is there a roadmap or schedule for supporting the tool call feature for DeepSeek-V3? Since the official API already supports this feature, it seems it should be possible to implement?

tingjun-cs avatar Mar 13 '25 10:03 tingjun-cs

Yes, it is possible technically.

/cc @aarnphm Not sure if you are interested in this

gaocegege avatar Mar 13 '25 10:03 gaocegege

@tingjun-cs Also, contributions are welcome if you are interested 😄

gaocegege avatar Mar 13 '25 10:03 gaocegege

@tingjun-cs Also, contributions are welcome if you are interested 😄

Can you briefly introduce the implementation principles and ideas in this area? I'm currently just starting to get acquainted with this field.

tingjun-cs avatar Mar 13 '25 11:03 tingjun-cs

We probably want to work on supporting structural tags first, but we can check whether DeepSeek has its own function-calling format.

You can try pythonic, hermes, or llama3_json

aarnphm avatar Mar 13 '25 12:03 aarnphm

I found that DeepSeek-V3 tool calling requires additional handling of the logic within the messages, since the tool_calls information is placed inside the messages themselves: its chat template contains {%- for tool in message['tool_calls'] %}. @gaocegege @aarnphm @tingjun-cs

WangErXiao avatar Mar 13 '25 12:03 WangErXiao

It seems like they have a special format for this

<|tool▁calls▁begin|>

we can write a parser for this.
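
For illustration, the core of such a parser could be a single regex over those markers. A rough sketch; the full marker set (<|tool▁call▁begin|>, <|tool▁sep|>, <|tool▁call▁end|>) is an assumption based on DeepSeek's published format, not existing vLLM code:

import re

# Rough sketch only: the marker and separator names are assumptions based on
# DeepSeek's published tool-call format.
TOOL_CALL_RE = re.compile(
    r"<\|tool▁call▁begin\|>(?P<type>.+?)<\|tool▁sep\|>(?P<name>.+?)\n"
    r"```json\n(?P<args>.+?)\n```<\|tool▁call▁end\|>",
    re.DOTALL,
)

def extract_tool_calls(text: str) -> list[dict]:
    """Pull (type, name, raw JSON arguments) out of a model completion."""
    return [
        {"type": m["type"], "name": m["name"], "arguments": m["args"]}
        for m in TOOL_CALL_RE.finditer(text)
    ]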

aarnphm avatar Mar 13 '25 23:03 aarnphm

Yes, it also needs handling of the logic in serving_chat.py. Currently the tools info is in the request.tools field, but the DeepSeek-V3 chat_template only handles tools info inside messages. @aarnphm

WangErXiao avatar Mar 14 '25 01:03 WangErXiao

Yes, it also needs handling of the logic in serving_chat.py. Currently the tools info is in the request.tools field, but the DeepSeek-V3 chat_template only handles tools info inside messages. @aarnphm

@WangErXiao @aarnphm In the chat template for DeepSeek-V3, I did not find any logic that handles tools, as is done in QwQ-32B. So how does DeepSeek-V3 implement the parsing of natural language into function calls?

QwQ-32B:

{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0]['role'] == 'system' %}
        {{- messages[0]['content'] }}
    {%- else %}
        {{- '' }}
    {%- endif %}
    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query."
        "\n\nYou are provided with function signatures within <tools></tools> XML tags:\n"
        "<tools>"
    }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and "
        "arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n"
        "{\"name\": <function-name>, \"arguments\": <args-json-object>}\n"
        "</tool_call><|im_end|>\n"
    }}
{%- else %}
    {%- if messages[0]['role'] == 'system' %}
        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}

DeepSeek v3:

{% if not add_generation_prompt is defined %}
    {% set add_generation_prompt = false %}
{% endif %}

{% set ns = namespace(
    is_first=false,
    is_tool=false,
    is_output_first=true,
    system_prompt='',
    is_first_sp=true
) %}

{%- for message in messages %}
    {%- if message['role'] == 'system' %}
        {%- if ns.is_first_sp %}
            {% set ns.system_prompt = ns.system_prompt + message['content'] %}
            {% set ns.is_first_sp = false %}
        {%- else %}
            {% set ns.system_prompt = ns.system_prompt + '\n\n' + message['content'] %}
        {%- endif %}
    {%- endif %}
{%- endfor %}

{{ bos_token }}

{{ ns.system_prompt }}

{%- for message in messages %}
    {%- if message['role'] == 'user' %}
        {%- set ns.is_tool = false -%}
        {{ '<|User|>' + message['content'] }}

    {%- elif message['role'] == 'assistant' and message['content'] is none %}
        {%- set ns.is_tool = false -%}
        {%- for tool in message['tool_calls'] %}
            {%- if not ns.is_first %}
                {{ '<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name']
                + '\n' + '```json' + '\n'
                + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>' }}
                {%- set ns.is_first = true -%}
            {%- else %}
                {{ '\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name']
                + '\n' + '```json' + '\n'
                + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>' }}
                {{ '<|tool▁calls▁end|><|end▁of▁sentence|>' }}
            {%- endif %}
        {%- endfor %}

    {%- elif message['role'] == 'assistant' and message['content'] is not none %}
        {%- if ns.is_tool %}
            {{ '<|tool▁outputs▁end|>' + message['content'] + '<|end▁of▁sentence|>' }}
            {%- set ns.is_tool = false -%}
        {%- else %}
            {{ '<|Assistant|>' + message['content'] + '<|end▁of▁sentence|>' }}
        {%- endif %}

    {%- elif message['role'] == 'tool' %}
        {%- set ns.is_tool = true -%}
        {%- if ns.is_output_first %}
            {{ '<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' }}
            {%- set ns.is_output_first = false %}
        {%- else %}
            {{ '\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' }}
        {%- endif %}
    {%- endif %}
{%- endfor -%}

{% if ns.is_tool %}
    {{ '<|tool▁outputs▁end|>' }}
{% endif %}
{% if add_generation_prompt and not ns.is_tool %}
    {{ '<|Assistant|>' }}
{% endif %}

tingjun-cs avatar Mar 14 '25 09:03 tingjun-cs

Yes, it also needs handling of the logic in serving_chat.py. Currently the tools info is in the request.tools field, but the DeepSeek-V3 chat_template only handles tools info inside messages. @aarnphm

@WangErXiao @aarnphm From the DeepSeek official website, it seems that they also support tools being passed independently of messages. However, I don't quite understand how they achieve this, because I couldn't find any tools-related logic in their chat_template.

from openai import OpenAI

def send_messages(messages):
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools
    )
    return response.choices[0].message

client = OpenAI(
    api_key="<your api key>",
    base_url="https://api.deepseek.com",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather of an location, the user shoud supply a location first",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    }
                },
                "required": ["location"]
            },
        }
    },
]

messages = [{"role": "user", "content": "How's the weather in Hangzhou?"}]
message = send_messages(messages)
print(f"User>\t {messages[0]['content']}")

tool = message.tool_calls[0]
messages.append(message)

messages.append({"role": "tool", "tool_call_id": tool.id, "content": "24℃"})
message = send_messages(messages)
print(f"Model>\t {message.content}")

tingjun-cs avatar Mar 14 '25 09:03 tingjun-cs

Yes, in the DeepSeek-V3 chat template, tools are included within messages, whereas the OpenAI API places them in a separate tools field. However, I believe DeepSeek's API remains compatible with OpenAI's interface. So to achieve this feature, we need to either update the chat template logic of DeepSeek-V3 or modify vLLM's OpenAI-serving logic to move the tools into a separate message, as sketched below. @tingjun-cs
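
For the second option, a minimal client-side sketch: render the OpenAI-style tools list into a system message before it ever reaches the chat template. The section headings follow the system-prompt format DeepSeek documents for V2.5; treat this as an illustration rather than vLLM code:

import json

def tools_to_system_prompt(tools: list[dict],
                           base: str = "You are a helpful Assistant.") -> str:
    """Render OpenAI-style tool definitions into a system prompt, since the
    DeepSeek-V3 chat template has no branch that renders a `tools` field."""
    lines = [base, "", "## Tools", "", "### Function", "",
             "You have the following functions available:", ""]
    for tool in tools:
        fn = tool["function"]
        lines += [f"- `{fn['name']}`:", "```json",
                  json.dumps(fn, indent=4), "```", ""]
    return "\n".join(lines)

# Usage: prepend the rendered prompt as the system message of the request, e.g.
# messages = [{"role": "system", "content": tools_to_system_prompt(tools)},
#             {"role": "user", "content": "How's the weather in Hangzhou?"}]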

WangErXiao avatar Mar 14 '25 14:03 WangErXiao

Both V3 and R1 use the same chat template format for tools. From the user's perspective it should be the same as the OpenAI interface.

From vLLM's perspective, we just need to implement a tool parser for the DeepSeek format (I can take a look over the weekend).

aarnphm avatar Mar 15 '25 03:03 aarnphm

Both V3 and R1 use the same chat template format for tools. From the user's perspective it should be the same as the OpenAI interface.

From vLLM's perspective, we just need to implement a tool parser for the DeepSeek format (I can take a look over the weekend).

@WangErXiao @aarnphm However, I did not find any template processing related to "function" in DeepSeek's chat_template. So how should the function template (including the function name, parameters, and description) be passed to the model?

example of function template:

"function": {
                "name": "get_weather",
                "description": "Get weather of an location, the user shoud supply a location first",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        }
                    },
                    "required": ["location"]
                },
            }

In DeepSeek's chat template, I only found two pieces of processing logic related to function calls:

  1. This logic seems to render the tool calls that the model has already parsed out of natural language:
    {%- elif message['role'] == 'assistant' and message['content'] is none %}
        {%- set ns.is_tool = false -%}
        {%- for tool in message['tool_calls'] %}
            {%- if not ns.is_first %}
                {{ '<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name']
                + '\n' + '```json' + '\n'
                + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>' }}
                {%- set ns.is_first = true -%}
            {%- else %}
                {{ '\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name']
                + '\n' + '```json' + '\n'
                + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>' }}
                {{ '<|tool▁calls▁end|><|end▁of▁sentence|>' }}
            {%- endif %}
        {%- endfor %}
  2. This logic seems to handle the result returned after the function call has been executed:
    {%- elif message['role'] == 'tool' %}
        {%- set ns.is_tool = true -%}
        {%- if ns.is_output_first %}
            {{ '<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' }}
            {%- set ns.is_output_first = false %}
        {%- else %}
            {{ '\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' }}
        {%- endif %}

However, I haven't found the logic that parses natural language into a function call. Shouldn't there be additional information required for this?

tingjun-cs avatar Mar 17 '25 09:03 tingjun-cs

I encountered the same issue. Did you solve it?

cchheennpsp avatar Mar 18 '25 12:03 cchheennpsp

According to the README.md of DeepSeek-V3-0324:

This model supports features such as function calling, JSON output, and FIM completion. For instructions on how to construct prompts to use these features, please refer to DeepSeek-V2.5 repo.

This is their tool-call template:

import torch
from transformers import GenerationConfig

# Assume that `model` and `tokenizer` are loaded
model.generation_config = GenerationConfig(do_sample=False, max_new_tokens=128, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.eos_token_id)

tool_system_prompt = """You are a helpful Assistant.

## Tools

### Function

You have the following functions available:

- `get_current_weather`:
```json
{
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
                "type": "string",
                "enum": [
                    "celsius",
                    "fahrenheit"
                ]
            }
        },
        "required": [
            "location"
        ]
    }
}
```"""

tool_call_messages = [{"role": "system", "content": tool_system_prompt}, {"role": "user", "content": "What's the weather like in Tokyo and Paris?"}]
tool_call_inputs = tokenizer.apply_chat_template(tool_call_messages, add_generation_prompt=True, return_tensors="pt")
tool_call_outputs = model.generate(tool_call_inputs.to(model.device))
# Generated text: '<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>get_current_weather\n```json\n{"location": "Tokyo"}\n```<|tool▁call▁end|>\n<|tool▁call▁begin|>function<|tool▁sep|>get_current_weather\n```json\n{"location": "Paris"}\n```<|tool▁call▁end|><|tool▁calls▁end|><|end▁of▁sentence|>'

# Mock response of calling `get_current_weather`
tool_messages = [{"role": "tool", "content": '{"location": "Tokyo", "temperature": "10", "unit": null}'}, {"role": "tool", "content": '{"location": "Paris", "temperature": "22", "unit": null}'}]
tool_inputs = tokenizer.apply_chat_template(tool_messages, add_generation_prompt=False, return_tensors="pt")[:, 1:]
tool_inputs = torch.cat([tool_call_outputs, tool_inputs.to(model.device)], dim=1)
tool_outputs = model.generate(tool_inputs)
# Generated text: The current weather in Tokyo is 10 degrees, and in Paris, it is 22 degrees.<|end▁of▁sentence|>

huyz-git avatar Mar 26 '25 01:03 huyz-git

Both V3 and R1 use the same chat template format for tools. From the user's perspective it should be the same as the OpenAI interface.

From vLLM's perspective, we just need to implement a tool parser for the DeepSeek format (I can take a look over the weekend).

Hello! I encountered the same problem. I deployed DeepSeek-V3-0324 with vLLM using enable_auto_tool_choice=True and tool_call_parser='hermes', but when the tools parameter is passed in through the OpenAI interface, there is still an error. Has this problem been solved, or is there a plan for a solution? Thank you very much.

liujinpen avatar Mar 28 '25 03:03 liujinpen

Both V3 and R1 use the same chat template format for tools. From the user's perspective it should be the same as the OpenAI interface. From vLLM's perspective, we just need to implement a tool parser for the DeepSeek format (I can take a look over the weekend).

Hello! I encountered the same problem. I deployed DeepSeek-V3-0324 with vLLM using enable_auto_tool_choice=True and tool_call_parser='hermes', but when the tools parameter is passed in through the OpenAI interface, there is still an error. Has this problem been solved, or is there a plan for a solution? Thank you very much.

m

HandcakeBear avatar Mar 31 '25 12:03 HandcakeBear


mark

ant-ob-hengtang avatar Apr 01 '25 06:04 ant-ob-hengtang

mark

WallE-Chang avatar Apr 07 '25 08:04 WallE-Chang

mark

elepherai avatar Apr 08 '25 00:04 elepherai

mark

sarihuangshanrong avatar Apr 10 '25 03:04 sarihuangshanrong

If you modify the template, DeepSeek-V3 works. The model on vLLM responds as below:

{
    "id": "chatcmpl-d746bc93142f424ca8a556fc84f1ac23",
    "object": "chat.completion",
    "created": 1744357134,
    "model": "deepseek_v3",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "reasoning_content": null,
                "content": "<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>get_weather\n```json\n{\"location\":\"San Francisco, CA\",\"unit\":\"celsius\"}\n```<|tool▁call▁end|><|tool▁calls▁end|>",
                "tool_calls": [

                ]
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null
        }
    ],
    "usage": {
        "prompt_tokens": 149,
        "total_tokens": 177,
        "completion_tokens": 28,
        "prompt_tokens_details": null
    },
    "prompt_logprobs": null
}

But there is no suitable parser to parse the result; see the sketch below.
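
For illustration, a custom parser along these lines could be wired in through vLLM's tool-parser plugin mechanism. This is a rough, untested skeleton; the import paths, the ExtractedToolCallInformation shape, and the registration decorator follow recent vLLM versions and may differ between releases:

import re

from vllm.entrypoints.openai.protocol import (ChatCompletionRequest,
                                              ExtractedToolCallInformation,
                                              FunctionCall, ToolCall)
from vllm.entrypoints.openai.tool_parsers.abstract_tool_parser import (
    ToolParser, ToolParserManager)

# Matches one <|tool▁call▁begin|>...<|tool▁call▁end|> block, as in the
# response content above.
TOOL_CALL_RE = re.compile(
    r"<\|tool▁call▁begin\|>.+?<\|tool▁sep\|>(?P<name>.+?)\n"
    r"```json\n(?P<args>.+?)\n```<\|tool▁call▁end\|>",
    re.DOTALL,
)

@ToolParserManager.register_module("deepseek_v3")
class DeepSeekV3ToolParser(ToolParser):
    """Sketch of a non-streaming parser for the DeepSeek tool-call format."""

    def extract_tool_calls(
        self, model_output: str, request: ChatCompletionRequest
    ) -> ExtractedToolCallInformation:
        matches = list(TOOL_CALL_RE.finditer(model_output))
        if not matches:
            return ExtractedToolCallInformation(
                tools_called=False, tool_calls=[], content=model_output)
        tool_calls = [
            ToolCall(function=FunctionCall(name=m["name"], arguments=m["args"]))
            for m in matches
        ]
        return ExtractedToolCallInformation(
            tools_called=True, tool_calls=tool_calls, content=None)

It would be loaded with something like vllm serve ... --enable-auto-tool-choice --tool-parser-plugin ./deepseek_v3_parser.py --tool-call-parser deepseek_v3 (flag names assumed from vLLM's tool-parser plugin docs); a streaming variant would also need extract_tool_calls_streaming.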

xsank avatar Apr 11 '25 08:04 xsank

mark

li8696 avatar Apr 12 '25 02:04 li8696

Why are the tool call arguments for Qwen/QwQ-32B None instead of {} when no arguments are needed to call the function?

npuichigo avatar Apr 13 '25 07:04 npuichigo