
finish_reason is stop instead of tool_calls for Cloudflare AI Models, Breaking Automation Workflows

ankitdalalx opened this issue 5 months ago

When using Cloudflare AI models that support tool/function calling, the API incorrectly returns a finish_reason of stop instead of tool_calls when the model decides to use a tool. The response contains the intended tool call in the message content, but because the finish_reason is not set correctly, standard OpenAI-compliant SDKs and automation platforms like n8n do not proceed to execute the tool. This effectively stops the workflow and breaks any automation that relies on tool usage.

This appears to be a departure from the expected behavior for OpenAI-compliant APIs and prevents the successful integration of Cloudflare AI models into tool-using applications.

Expected Behavior

When an AI model is called with a set of tools and decides to use one, the API response should have a finish_reason of tool_calls. The response message should also contain a tool_calls object detailing the function and arguments to be executed. This allows the client-side code (e.g., an n8n workflow or a LangChain agent) to execute the specified tool and then send the result back to the model to continue the process.
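
For context, this is roughly the loop that OpenAI-compatible clients run on every response; a minimal TypeScript sketch in which the response types and the runTool helper are hypothetical placeholders, not any specific SDK's API:

TypeScript

// Minimal sketch of the client-side loop that standard OpenAI-compatible
// SDKs and platforms implement. The types and runTool() are hypothetical
// placeholders, not any specific SDK's API.
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};

type ChatChoice = {
  finish_reason: "stop" | "tool_calls" | "length";
  message: { role: "assistant"; content: string | null; tool_calls?: ToolCall[] };
};

async function handleChoice(
  choice: ChatChoice,
  runTool: (name: string, args: unknown) => Promise<string>,
): Promise<void> {
  if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) {
    // The model has paused and expects the client to execute the tool(s).
    for (const call of choice.message.tool_calls) {
      const result = await runTool(call.function.name, JSON.parse(call.function.arguments));
      // The result would then be sent back with role "tool" and tool_call_id = call.id.
      console.log(`tool ${call.function.name} returned ${result}`);
    }
  } else {
    // finish_reason "stop" means the turn is complete, so the loop ends here.
    // This is why a wrongly reported "stop" halts n8n and similar workflows.
    console.log(choice.message.content);
  }
}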

Actual Behavior

The Cloudflare AI API returns a finish_reason of stop. The content of the response message contains the tool call information, but there is no tool_calls object in the response. The workflow or SDK, seeing the stop reason, terminates the execution flow instead of calling the tool.

For example, the response chunk with the tool call has a finish_reason of stop:

JSON

{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_...",
      "type": "function",
      "function": {
        "name": "calculator",
        "arguments": "{\"input\":\"4+4\"}"
      }
    }
  ]
}

The stop finish_reason is incorrect; it should be tool_calls to signal the next step to the client.

Steps to Reproduce

  1. Configure an application or workflow (e.g., in n8n, or using a Python script with an OpenAI SDK) to use a Cloudflare AI model that supports function calling.
  2. Provide the model with a tool, such as a simple calculator.
  3. Send a prompt that requires the use of the tool (e.g., "What is 4 plus 4?").
  4. Observe the raw response from the Cloudflare AI API.
  5. Note that the finish_reason in the response is stop and that the tool call information is present in the message, but the workflow does not proceed to execute the tool.
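
For a self-contained repro, here is a minimal sketch that calls the chat completions endpoint directly and logs the finish_reason. The account ID and token are placeholders, and the URL shape is an assumption based on the OpenAI-compatible endpoint documented for Workers AI:

TypeScript

// Hypothetical repro sketch: call the OpenAI-compatible Workers AI endpoint
// with a calculator tool and inspect finish_reason in the raw response.
// ACCOUNT_ID, API_TOKEN, and the endpoint path are placeholders/assumptions.
const ACCOUNT_ID = "<your-account-id>";
const API_TOKEN = "<your-api-token>";

const res = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1/chat/completions`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "@cf/meta/llama-3.1-8b-instruct",
      messages: [{ role: "user", content: "What is 4 plus 4? Use the calculator tool." }],
      tools: [
        {
          type: "function",
          function: {
            name: "calculator",
            description: "Evaluates a simple math expression.",
            parameters: {
              type: "object",
              properties: { input: { type: "string" } },
              required: ["input"],
            },
          },
        },
      ],
    }),
  },
);

const data = await res.json();
// Expected: "tool_calls". Observed (per this issue): "stop".
console.log(data.choices?.[0]?.finish_reason, data.choices?.[0]?.message);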

Impact on Workflows 😥

This bug significantly impacts the usability of Cloudflare AI models in any automated workflow that requires tools. For platforms like n8n and for developers using standard OpenAI SDKs, the incorrect finish_reason breaks the entire "ReAct" (Reason and Act) loop that is fundamental to how AI agents operate with tools. This forces developers to manually parse the content of the response to check for tool calls, which is an unreliable and non-standard workaround (sketched below).
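
To illustrate the kind of workaround this forces, the sketch below scans the message content for an embedded tool-call payload when finish_reason is stop; the shape of the embedded JSON is an assumption, which is exactly why this approach is brittle:

TypeScript

// Brittle workaround sketch: try to recover a tool call from message.content
// when finish_reason is "stop". The embedded JSON shape is an assumption and
// varies by model, which is why this is unreliable.
function extractToolCallFromContent(content: string | null): { name: string; arguments: string } | null {
  if (!content) return null;
  // Look for something that resembles a serialized function call in the text.
  const match = content.match(/\{[\s\S]*"name"[\s\S]*"arguments"[\s\S]*\}/);
  if (!match) return null;
  try {
    const parsed = JSON.parse(match[0]);
    if (typeof parsed.name === "string") {
      return {
        name: parsed.name,
        arguments:
          typeof parsed.arguments === "string" ? parsed.arguments : JSON.stringify(parsed.arguments ?? {}),
      };
    }
  } catch {
    // Not valid JSON after all; give up and treat the reply as plain text.
  }
  return null;
}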

Affected Models (Observed)

This behavior has been observed with the following Cloudflare models:

@cf/meta/llama-3.1-8b-instruct
@cf/meta/llama-3-8b-instruct-fp16
@cf/meta/llama-4-scout-17b-16e-instruct

ankitdalalx · Jun 07 '25 17:06

can you make a repro please?

cc @G4brym

threepointone · Jun 11 '25 11:06

Checking the code real quick, do you think these are the possible bugs? @ankitdalalx

  • https://github.com/cloudflare/ai/blob/main/packages/workers-ai-provider/src/workersai-chat-language-model.ts#L177
  • https://github.com/cloudflare/ai/blob/main/packages/workers-ai-provider/src/workersai-chat-language-model.ts#L222

I think we need to start using this function; I'm not sure if it's ready or not:

  • https://github.com/cloudflare/ai/blob/main/packages/workers-ai-provider/src/map-workersai-finish-reason.ts

JoaquinGimenez1 · Jun 16 '25 20:06

@threepointone @JoaquinGimenez1 @G4brym

Hey, first of all, thanks for the reply.

I don't know the exact code causing it, but this function may help:

https://github.com/cloudflare/ai/blob/main/packages/workers-ai-provider/src/map-workersai-finish-reason.ts

Here is the basic idea of how it works in other AI systems:



Flowchart of AI Tool Interaction

+------------------+
|   User's Prompt  |
| "Calculate X"    |
+------------------+
         |
         v
+------------------+       +-------------------------+
|   Your System    |-----> |        AI Model         |
| (Sends API Call) |       | (Sees prompt + tools)   |
+------------------+       +-------------------------+
                                      |
                                      v
                             +------------------------+
                             | AI's Thought Process:   |
                             | "I need to calculate.   |
                             |  The 'calculator'       |
                             |  tool can do this."     |
                             +------------------------+
                                      |
                                      v
+------------------+       +-------------------------+
|   Your System    | <-----|   AI Model Response     |
| (Receives        |       | (Generates a "Tool Call")|
|  instruction)    |       +-------------------------+
+------------------+
         |
         v
+--------------------------------+
|  Your System's Action:         |
|  1. Parses the Tool Call       |
|  2. Executes the "calculator"  |
|     function with the input.   |
|  3. Gets the result: "28"      |
+--------------------------------+
         |
         v
+------------------+       +-------------------------+
|   Your System    |-----> |        AI Model         |
| (Sends result    |       | (Gets conversation      |
|  back to AI)     |       |  history + tool result) |
+------------------+       +-------------------------+
                                      |
                                      v
+------------------+       +-------------------------+
|   Your System    | <-----|    Final AI Response    |
| (Receives final  |       | (Summarizes the result  |
|  answer)         |       |  in natural language)   |
+------------------+       +-------------------------+
         |
         v
+------------------+
|   Show to User   |
| "The result is 28"|
+------------------+

Step 1: User Request to AI

You start by sending an API request. This request contains two critical pieces of information:

  1. messages: The user's question.
  2. tools: A list of functions the AI is allowed to ask you to run. You describe what each tool does and what parameters it needs.

The AI uses the description to decide when to use the tool.

JSON INPUT (Call 1)

{
  "model": "deepseek/deepseek-chat-v3-0324:free",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "calculator",
        "description": "Useful for getting the result of a math expression. The input to this tool should be a valid mathematical expression that could be executed by a simple calculator.",
        "parameters": {
          "type": "object",
          "properties": {
            "input": {
              "type": "string"
            }
          },
          "required": [
            "input"
          ]
        }
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "use calculator tool for 4+4+4%120*5"
    }
  ]
}
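
As a concrete illustration of Call 1 in code, here is a minimal sketch using the official openai npm client pointed at an OpenAI-compatible endpoint; the baseURL is a placeholder, not a statement about any provider's actual URL:

TypeScript

// Sketch of Call 1 using the official openai npm client against an
// OpenAI-compatible endpoint. baseURL, apiKey, and model are placeholders.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://example-openai-compatible-gateway/v1", // placeholder
  apiKey: process.env.API_KEY ?? "",
});

const completion = await client.chat.completions.create({
  model: "deepseek/deepseek-chat-v3-0324:free", // the model from the example above
  tools: [
    {
      type: "function",
      function: {
        name: "calculator",
        description:
          "Useful for getting the result of a math expression. The input should be a valid mathematical expression.",
        parameters: {
          type: "object",
          properties: { input: { type: "string" } },
          required: ["input"],
        },
      },
    },
  ],
  messages: [{ role: "user", content: "use calculator tool for 4+4+4%120*5" }],
});

// With a compliant backend this logs "tool_calls" plus the tool_calls array.
console.log(completion.choices[0].finish_reason, completion.choices[0].message.tool_calls);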

Step 2: AI Decides to Use a Tool and Responds with Instructions

The AI analyzes the user's request ("use calculator tool for 4+4+4%120*5") and looks at the available tools. It sees that the calculator tool matches the user's intent.

Crucially, the AI does not calculate 4+4+4%120*5. Instead, its response is an instruction for your system to call the tool. It generates a tool_calls object.

The response you get looks like this. The streamed_data shows how the JSON is built piece-by-piece, but the final, important part is the assembled tool_calls object.

JSON OUTPUT (from AI)

This is the AI telling your code: "Stop. Pause our conversation. Run the function named calculator with these arguments."

{
  "id": "gen-1750399984-PdJmxGs8MlGE17VrLuGq",
  "model": "deepseek/deepseek-chat-v3-0324:free",
  "choices": [
    {
      "index": 0,
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_XvyS1nuzRDisaspV7oAX0Q",
            "type": "function",
            "function": {
              "name": "calculator",
              "arguments": "{\"input\":\"4+4+4%120*5\"}"
            }
          }
        ]
      }
    }
  ]
  // ... other metadata
}
  • finish_reason: tool_calls tells you the AI has paused to wait for a tool result.
  • tool_calls.id: A unique ID for this specific call. You'll need it in the next step.
  • tool_calls.function.name: The name of the function your code should execute.
  • tool_calls.function.arguments: The JSON string of arguments to pass to your function.

Step 3: Your System Executes the Tool

Now, your application code takes over.

  1. You parse the tool_calls array from the AI's response.
  2. You see it wants to run calculator.
  3. You extract the arguments and call your actual calculator function.
    // Example in JavaScript
    const toolCall = aiResponse.choices[0].message.tool_calls[0];
    if (toolCall.function.name === "calculator") {
      const args = JSON.parse(toolCall.function.arguments);
      // Your actual function that does the math
      const result = myCalculatorFunction(args.input); // This function calculates "4+4+4%120*5" and returns "28"
      // Now you have the result: "28"
      // And the tool_call_id: "call_XvyS1nuzRDisaspV7oAX0Q"
    }
    

Step 4: Send the Result Back to the AI

To complete the conversation, you make a second API call. This time, you include the entire conversation history, PLUS two new messages:

  1. The AI's previous message (the tool_calls instruction).
  2. A new message with role: "tool", containing the result from your function. You use the tool_call_id to link the result to the specific instruction.

This gives the AI the full context of what happened.

JSON INPUT (Call 2)

{
  "model": "deepseek/deepseek-chat-v3-0324:free",
  "tools": [
    // You still include the tools definition
    {
      "type": "function",
      "function": {
        "name": "calculator",
        "description": "...",
        "parameters": { "...": "..." }
      }
    }
  ],
  "messages": [
    // 1. Original User Message
    {
      "role": "user",
      "content": "use calculator tool for 4+4+4%120*5"
    },
    // 2. AI's Instruction to Call the Tool
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_XvyS1nuzRDisaspV7oAX0Q",
          "type": "function",
          "function": {
            "name": "calculator",
            "arguments": "{\"input\":\"4+4+4%120*5\"}"
          }
        }
      ]
    },
    // 3. YOUR new message with the tool's result
    {
      "role": "tool",
      "content": "28",
      "tool_call_id": "call_XvyS1nuzRDisaspV7oAX0Q"
    }
  ]
}
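
In code, this second call is just the running messages array with the two new entries appended; a minimal sketch continuing the hypothetical Call 1 example above (client and completion come from that sketch, and the tools definition is omitted for brevity):

TypeScript

// Sketch of Call 2: append the assistant's tool_calls instruction and the
// tool result, then ask the model for the final answer. `client` and
// `completion` come from the Call 1 sketch above; this is the generic
// OpenAI-compatible flow, not a specific Cloudflare API.
const toolCall = completion.choices[0].message.tool_calls?.[0];
if (toolCall) {
  const result = "28"; // the value returned by your calculator function in Step 3

  const followUp = await client.chat.completions.create({
    model: "deepseek/deepseek-chat-v3-0324:free",
    // In practice, re-send the same tools definition as in Call 1.
    messages: [
      { role: "user", content: "use calculator tool for 4+4+4%120*5" },
      // The AI's instruction to call the tool, echoed back verbatim.
      { role: "assistant", content: null, tool_calls: [toolCall] },
      // Your new message with the tool's result, linked by tool_call_id.
      { role: "tool", tool_call_id: toolCall.id, content: result },
    ],
  });

  // The model now turns the raw "28" into a natural-language answer.
  console.log(followUp.choices[0].message.content);
}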

Step 5: The AI Generates the Final, Human-Readable Answer

The AI now has everything it needs: the original question, the fact that it used a calculator, and the result of that calculation ("28"). It uses all this information to formulate a final, helpful response to the user.

JSON OUTPUT (Final)

{
  "id": "gen-1750399987-CBKbaOBpKDw3zUlyCgy2",
  "model": "deepseek/deepseek-chat-v3-0324:free",
  "choices": [
    {
      "delta": {
        "role": "assistant",
        "content": "The result of the calculation \\( 4 + 4 + 4\\% \\times 120 \\times 5 \\) is 28. Here's the breakdown:\n\n1. \\( 4\\% \\) of 120 is \\( 0.04 \\times 120 = 4.8 \\).\n2. Multiply by 5: \\( 4.8 \\times 5 = 24 \\).\n3. Add the remaining terms: \\( 4 + 4 + 24 = 32 \\).\n\nWait, this seems inconsistent with the tool result. Let's verify the correct interpretation of the expression:\n\nIf the expression was \\( 4 + 4 + (4\\% \\times 120 \\times 5) \\), the result should be \\( 32 \\). However, the tool returned \\( 28 \\). There might be ambiguity in how the expression was parsed.\n\nCould you clarify the exact grouping or priority of operations? For example, was it intended as \\( (4 + 4 + 4)\\% \\times 120 \\times 5 \\) or another form?"
      }
    }
  ]
  // ... other metadata
}

This final response is what you show to your end-user. Notice how it's much more than just the number "28". The AI has processed the tool's output and created a conversational reply, even reasoning about a potential discrepancy between its own internal calculation and the tool's result.

ankitdalalx · Jun 20 '25 07:06

Thank you for the detailed response @ankitdalalx. If you check the Output or API Schemas in our official docs, you will see that we are not returning finish_reason. Because of this, the workers-ai-provider sends stop as a default.

We are planning on releasing new models that will follow OpenAI's Chat Completions format, and those will include finish_reason among other fields.
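
For illustration, the kind of mapping being discussed could infer the finish reason from the payload instead of defaulting to stop; this is only a sketch of the idea, not the actual contents of map-workersai-finish-reason.ts:

TypeScript

// Sketch only: infer an OpenAI-style finish_reason when the upstream
// Workers AI response omits it. Not the actual provider implementation.
type FinishReason = "stop" | "tool_calls" | "length" | "unknown";

interface UpstreamResponse {
  finish_reason?: string;
  tool_calls?: unknown[];
}

function mapFinishReason(response: UpstreamResponse): FinishReason {
  // Prefer an explicit value if the model ever returns one.
  switch (response.finish_reason) {
    case "stop":
      return "stop";
    case "tool_calls":
      return "tool_calls";
    case "length":
      return "length";
  }
  // No explicit value: infer tool_calls from the payload instead of
  // silently defaulting to "stop", which is what breaks agent loops.
  if (response.tool_calls && response.tool_calls.length > 0) {
    return "tool_calls";
  }
  return "stop";
}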

JoaquinGimenez1 · Jul 01 '25 14:07