
[Bug]: Tools not triggered by OpenAI reasoning models

Open Lunik opened this issue 5 months ago • 3 comments

Environment (please complete the following):

  • OS: macOS 15.05
  • kubectl-ai version (run kubectl-ai version): 0.0.18
  • LLM provider: openai
  • LLM model: o4-mini

Describe the bug: When using a reasoning model such as o4-mini, tool calls are never triggered, even though the model returns one.

To Reproduce Steps to reproduce the behavior:

  1. Start kubectl-ai with an o4-mini model: kubectl-ai_0.0.18 --llm-provider openai --model o4-mini-2025-04-16 --alsologtostderr

  2. Ask it to list pods in two namespaces: >>> list my pods in the two following namespaces app-dev01 and app-dev02

Full trace:

I0718 12:54:11.741204   46406 terminal.go:302] Sending readline input to agent: "list my pods in the two following namespaces app-dev01 and app-dev02"
I0718 12:54:11.741404   46406 conversation.go:315] "Received input from channel" userInput={"query":"list my pods in the two following namespaces app-dev01 and app-dev02"}
I0718 12:54:11.741477   46406 conversation.go:151] Agent state changing from idle to running
I0718 12:54:11.741499   46406 conversation.go:349] "Set agent state to running, will process agentic loop" currIteration=0 currChatContent=1
I0718 12:54:11.741513   46406 conversation.go:409] "Processing agentic loop" currIteration=0 maxIterations=1 currChatContentLen=1
I0718 12:54:11.741484   46406 terminal.go:152] agent output: &{ID:2e46a571-a86b-412c-b164-32b0c1a83a5f Source:user Type:text Payload:list my pods in the two following namespaces app-dev01 and app-dev02 Timestamp:2025-07-18 12:54:11.741441 +0200 CEST m=+5.957249459}
I0718 12:54:18.224615   46406 openai.go:332] Accumulator state: {ChatCompletion:{ID:chatcmpl-4b3e6ea6-11aa-4711-863b-045481734c15 Choices:[{FinishReason: Index:0 Logprobs:{Content:[] Refusal:[] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} ExtraFields:map[] raw:}} Message:{Content: Refusal: Role:assistant Annotations:[] Audio:{ID: Data: ExpiresAt:0 Transcript: JSON:{ID:{status:0 raw:} Data:{status:0 raw:} ExpiresAt:{status:0 raw:} Transcript:{status:0 raw:} ExtraFields:map[] raw:}} FunctionCall:{Arguments: Name: JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} ToolCalls:[{ID:call_giyhjTxCaLs9i6Xz2dZ93HBJ Function:{Arguments:{"command":"kubectl get pods --namespace=app-dev01\nkubectl get pods --namespace=app-dev02","modifies_resource":"no"} Name:kubectl JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} Type:function JSON:{ID:{status:0 raw:} Function:{status:0 raw:} Type:{status:0 raw:} ExtraFields:map[] raw:}}] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} Role:{status:0 raw:} Annotations:{status:0 raw:} Audio:{status:0 raw:} FunctionCall:{status:0 raw:} ToolCalls:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{FinishReason:{status:0 raw:} Index:{status:0 raw:} Logprobs:{status:0 raw:} Message:{status:0 raw:} ExtraFields:map[] raw:}}] Created:1752836058 Model:o4-mini-2025-04-16 Object:chat.completion ServiceTier: SystemFingerprint: Usage:{CompletionTokens:0 PromptTokens:0 TotalTokens:0 CompletionTokensDetails:{AcceptedPredictionTokens:0 AudioTokens:0 ReasoningTokens:0 RejectedPredictionTokens:0 JSON:{AcceptedPredictionTokens:{status:0 raw:} AudioTokens:{status:0 raw:} ReasoningTokens:{status:0 raw:} RejectedPredictionTokens:{status:0 raw:} ExtraFields:map[] raw:}} PromptTokensDetails:{AudioTokens:0 CachedTokens:0 JSON:{AudioTokens:{status:0 raw:} CachedTokens:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{CompletionTokens:{status:0 raw:} PromptTokens:{status:0 raw:} TotalTokens:{status:0 raw:} 
CompletionTokensDetails:{status:0 raw:} PromptTokensDetails:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{ID:{status:0 raw:} Choices:{status:0 raw:} Created:{status:0 raw:} Model:{status:0 raw:} Object:{status:0 raw:} ServiceTier:{status:0 raw:} SystemFingerprint:{status:0 raw:} Usage:{status:0 raw:} ExtraFields:map[] raw:}} choiceChatCompletionStates:[{state:1 index:0}] justFinished:{state:0 index:0}}
I0718 12:54:18.225091   46406 openai.go:332] Accumulator state: {ChatCompletion:{ID:chatcmpl-4b3e6ea6-11aa-4711-863b-045481734c15 Choices:[{FinishReason:tool_calls Index:0 Logprobs:{Content:[] Refusal:[] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} ExtraFields:map[] raw:}} Message:{Content: Refusal: Role:assistant Annotations:[] Audio:{ID: Data: ExpiresAt:0 Transcript: JSON:{ID:{status:0 raw:} Data:{status:0 raw:} ExpiresAt:{status:0 raw:} Transcript:{status:0 raw:} ExtraFields:map[] raw:}} FunctionCall:{Arguments: Name: JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} ToolCalls:[{ID:call_giyhjTxCaLs9i6Xz2dZ93HBJ Function:{Arguments:{"command":"kubectl get pods --namespace=app-dev01\nkubectl get pods --namespace=app-dev02","modifies_resource":"no"} Name:kubectl JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} Type:function JSON:{ID:{status:0 raw:} Function:{status:0 raw:} Type:{status:0 raw:} ExtraFields:map[] raw:}}] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} Role:{status:0 raw:} Annotations:{status:0 raw:} Audio:{status:0 raw:} FunctionCall:{status:0 raw:} ToolCalls:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{FinishReason:{status:0 raw:} Index:{status:0 raw:} Logprobs:{status:0 raw:} Message:{status:0 raw:} ExtraFields:map[] raw:}}] Created:1752836058 Model:o4-mini-2025-04-16 Object:chat.completion ServiceTier: SystemFingerprint: Usage:{CompletionTokens:0 PromptTokens:0 TotalTokens:0 CompletionTokensDetails:{AcceptedPredictionTokens:0 AudioTokens:0 ReasoningTokens:0 RejectedPredictionTokens:0 JSON:{AcceptedPredictionTokens:{status:0 raw:} AudioTokens:{status:0 raw:} ReasoningTokens:{status:0 raw:} RejectedPredictionTokens:{status:0 raw:} ExtraFields:map[] raw:}} PromptTokensDetails:{AudioTokens:0 CachedTokens:0 JSON:{AudioTokens:{status:0 raw:} CachedTokens:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{CompletionTokens:{status:0 raw:} PromptTokens:{status:0 raw:} 
TotalTokens:{status:0 raw:} CompletionTokensDetails:{status:0 raw:} PromptTokensDetails:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{ID:{status:0 raw:} Choices:{status:0 raw:} Created:{status:0 raw:} Model:{status:0 raw:} Object:{status:0 raw:} ServiceTier:{status:0 raw:} SystemFingerprint:{status:0 raw:} Usage:{status:0 raw:} ExtraFields:map[] raw:}} choiceChatCompletionStates:[{state:4 index:0}] justFinished:{state:1 index:0}}
I0718 12:54:18.226163   46406 openai.go:332] Accumulator state: {ChatCompletion:{ID:chatcmpl-4b3e6ea6-11aa-4711-863b-045481734c15 Choices:[{FinishReason: Index:0 Logprobs:{Content:[] Refusal:[] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} ExtraFields:map[] raw:}} Message:{Content: Refusal: Role:assistant Annotations:[] Audio:{ID: Data: ExpiresAt:0 Transcript: JSON:{ID:{status:0 raw:} Data:{status:0 raw:} ExpiresAt:{status:0 raw:} Transcript:{status:0 raw:} ExtraFields:map[] raw:}} FunctionCall:{Arguments: Name: JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} ToolCalls:[{ID:call_giyhjTxCaLs9i6Xz2dZ93HBJ Function:{Arguments:{"command":"kubectl get pods --namespace=app-dev01\nkubectl get pods --namespace=app-dev02","modifies_resource":"no"} Name:kubectl JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} Type:function JSON:{ID:{status:0 raw:} Function:{status:0 raw:} Type:{status:0 raw:} ExtraFields:map[] raw:}}] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} Role:{status:0 raw:} Annotations:{status:0 raw:} Audio:{status:0 raw:} FunctionCall:{status:0 raw:} ToolCalls:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{FinishReason:{status:0 raw:} Index:{status:0 raw:} Logprobs:{status:0 raw:} Message:{status:0 raw:} ExtraFields:map[] raw:}}] Created:1752836058 Model:o4-mini-2025-04-16 Object:chat.completion ServiceTier: SystemFingerprint: Usage:{CompletionTokens:624 PromptTokens:1797 TotalTokens:2421 CompletionTokensDetails:{AcceptedPredictionTokens:0 AudioTokens:0 ReasoningTokens:0 RejectedPredictionTokens:0 JSON:{AcceptedPredictionTokens:{status:0 raw:} AudioTokens:{status:0 raw:} ReasoningTokens:{status:0 raw:} RejectedPredictionTokens:{status:0 raw:} ExtraFields:map[] raw:}} PromptTokensDetails:{AudioTokens:0 CachedTokens:0 JSON:{AudioTokens:{status:0 raw:} CachedTokens:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{CompletionTokens:{status:0 raw:} PromptTokens:{status:0 raw:} TotalTokens:{status:0 
raw:} CompletionTokensDetails:{status:0 raw:} PromptTokensDetails:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{ID:{status:0 raw:} Choices:{status:0 raw:} Created:{status:0 raw:} Model:{status:0 raw:} Object:{status:0 raw:} ServiceTier:{status:0 raw:} SystemFingerprint:{status:0 raw:} Usage:{status:0 raw:} ExtraFields:map[] raw:}} choiceChatCompletionStates:[{state:4 index:0}] justFinished:{state:0 index:0}}
I0718 12:54:18.228261   46406 conversation.go:498] "streamedText" streamedText=""
I0718 12:54:18.228289   46406 conversation.go:505] "No function calls to be made, so most likely the task is completed, so we're done."
I0718 12:54:18.228298   46406 conversation.go:151] Agent state changing from running to done
I0718 12:54:18.228311   46406 conversation.go:510] "Agent task completed, transitioning to done state"
I0718 12:54:18.228323   46406 conversation.go:299] "Agent loop iteration" state="done"
I0718 12:54:18.228330   46406 conversation.go:308] "initiating user input"
I0718 12:54:18.228387   46406 terminal.go:152] agent output: &{ID:6b9f5c2d-177b-4360-97a2-cb9d12f2497e Source:agent Type:user-input-request Payload:>>> Timestamp:2025-07-18 12:54:18.228352 +0200 CEST m=+12.444213376}
I0718 12:54:18.228412   46406 terminal.go:258] Received user input request with payload: ">>>"

Actual behavior: The LLM calls a tool.

Accumulator content

ToolCalls: [{ID:call_giyhjTxCaLs9i6Xz2dZ93HBJ Function:{Arguments:{"command":"kubectl get pods --namespace=app-dev01\nkubectl get pods --namespace=app-dev02","modifies_resource":"no"} Name:kubectl JSON:{Arguments:{status:0 raw:} Name:{status:0 raw:} ExtraFields:map[] raw:}} Type:function JSON:{ID:{status:0 raw:} Function:{status:0 raw:} Type:{status:0 raw:} ExtraFields:map[] raw:}}] JSON:{Content:{status:0 raw:} Refusal:{status:0 raw:} Role:{status:0 raw:} Annotations:{status:0 raw:} Audio:{status:0 raw:} FunctionCall:{status:0 raw:} ToolCalls:{status:0 raw:} ExtraFields:map[] raw:}} JSON:{FinishReason:{status:0 raw:} Index:{status:0 raw:} Logprobs:{status:0 raw:} Message:{status:0 raw:} ExtraFields:map[] raw:}}]

But kubectl-ai didn't detect it correctly: No function calls to be made, so most likely the task is completed, so we're done.

Expected behavior: The tool should be called.

Additional context: I added extra debug logging on the accumulator:

klog.Infof("Accumulator state: %+v", acc)

The intercepted API response:

data: {"id":"chatcmpl-4b3e6ea6-11aa-4711-863b-045481734c15","choices":[{"delta":{"content":"","role":"assistant","tool_calls":[{"index":0,"id":"call_giyhjTxCaLs9i6Xz2dZ93HBJ","function":{"arguments":"{\"command\":\"kubectl get pods --namespace=app-dev01\\nkubectl get pods --namespace=app-dev02\",\"modifies_resource\":\"no\"}","name":"kubectl"},"type":"function"}]},"index":0}],"created":1752836058,"model":"o4-mini-2025-04-16","object":"chat.completion.chunk","stream_options":{"include_usage":true}}

data: {"id":"chatcmpl-4b3e6ea6-11aa-4711-863b-045481734c15","choices":[{"delta":{},"finish_reason":"tool_calls","index":0}],"created":1752836058,"model":"o4-mini-2025-04-16","object":"chat.completion.chunk","stream_options":{"include_usage":true}}

data: {"id":"chatcmpl-4b3e6ea6-11aa-4711-863b-045481734c15","choices":[{"delta":{},"index":0}],"created":1752836058,"model":"o4-mini-2025-04-16","object":"chat.completion.chunk","usage":{"completion_tokens":624,"prompt_tokens":1797,"total_tokens":2421,"completion_tokens_details":{"reasoning_tokens":0}},"stream_options":{"include_usage":true}}

data: [DONE]
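
For anyone who wants to inspect a captured stream like this outside kubectl-ai, the `data:` lines can be split out with the Go standard library alone. This is only a minimal sketch: `readSSEPayloads` is a hypothetical helper name (it is neither kubectl-ai nor OpenAI SDK code), and the sample stream in `main` is abbreviated from the chunks above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readSSEPayloads is a hypothetical helper: it returns the payload of every
// "data:" line in an SSE stream and stops at the "[DONE]" sentinel.
func readSSEPayloads(stream string) []string {
	var payloads []string
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank separators and non-data fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			break
		}
		payloads = append(payloads, payload)
	}
	return payloads
}

func main() {
	// Abbreviated sample: only the finish chunk and the sentinel.
	stream := "data: {\"choices\":[{\"delta\":{},\"finish_reason\":\"tool_calls\",\"index\":0}]}\n\n" +
		"data: [DONE]\n"
	for i, p := range readSSEPayloads(stream) {
		fmt.Printf("chunk %d: %s\n", i, p)
	}
}
```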

You can see that the response from the LLM API is odd: the first delta contains both the content and tool_calls attributes at the same time, which never happens with classical models like gpt-4.1.

The issue comes from this part of the code: https://github.com/GoogleCloudPlatform/kubectl-ai/blob/v0.0.18/gollm/openai.go#L326-L390

acc.JustFinishedToolCall() doesn't report that there is a tool call, so the call is never added to the response.
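
One possible way to make detection independent of this chunk shape is to merge tool-call deltas by index and return them once a chunk reports `finish_reason` equal to `tool_calls`, whether or not the first delta also carried an empty `content`. The sketch below only illustrates that idea with the standard library. The names `chunk`, `toolCall` and `collectToolCalls` are hypothetical, the arguments in the sample payload are abbreviated from the capture above, and none of this is the code in gollm/openai.go or the openai-go accumulator.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// chunk mirrors only the chat.completion.chunk fields used in this sketch.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content   string `json:"content"`
			ToolCalls []struct {
				Index    int    `json:"index"`
				ID       string `json:"id"`
				Function struct {
					Name      string `json:"name"`
					Arguments string `json:"arguments"`
				} `json:"function"`
			} `json:"tool_calls"`
		} `json:"delta"`
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
}

// toolCall is a hypothetical merged result, one per tool-call index.
type toolCall struct {
	ID, Name, Arguments string
}

// collectToolCalls merges tool-call deltas by index and returns them only if
// some chunk reported finish_reason "tool_calls". A content field alongside
// tool_calls in the same delta does not affect the result.
func collectToolCalls(payloads []string) ([]toolCall, error) {
	byIndex := map[int]*toolCall{}
	finished := false
	for _, p := range payloads {
		var c chunk
		if err := json.Unmarshal([]byte(p), &c); err != nil {
			return nil, err
		}
		for _, ch := range c.Choices {
			for _, d := range ch.Delta.ToolCalls {
				tc := byIndex[d.Index]
				if tc == nil {
					tc = &toolCall{}
					byIndex[d.Index] = tc
				}
				if d.ID != "" {
					tc.ID = d.ID
				}
				if d.Function.Name != "" {
					tc.Name = d.Function.Name
				}
				tc.Arguments += d.Function.Arguments // arguments may arrive in pieces
			}
			if ch.FinishReason == "tool_calls" {
				finished = true
			}
		}
	}
	if !finished {
		return nil, nil
	}
	indexes := make([]int, 0, len(byIndex))
	for i := range byIndex {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)
	calls := make([]toolCall, 0, len(indexes))
	for _, i := range indexes {
		calls = append(calls, *byIndex[i])
	}
	return calls, nil
}

func main() {
	// Abbreviated versions of the three chunks captured above.
	payloads := []string{
		`{"choices":[{"delta":{"content":"","role":"assistant","tool_calls":[{"index":0,"id":"call_giyhjTxCaLs9i6Xz2dZ93HBJ","function":{"arguments":"{\"command\":\"kubectl get pods --namespace=app-dev01\"}","name":"kubectl"},"type":"function"}]},"index":0}]}`,
		`{"choices":[{"delta":{},"finish_reason":"tool_calls","index":0}]}`,
		`{"choices":[{"delta":{},"index":0}]}`,
	}
	calls, err := collectToolCalls(payloads)
	if err != nil {
		panic(err)
	}
	for _, c := range calls {
		fmt.Printf("id=%s name=%s args=%s\n", c.ID, c.Name, c.Arguments)
	}
}
```

Keying on the finish reason rather than on a per-chunk "just finished" event means a tool call that arrives complete in a single delta is still picked up.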

Lunik avatar Jul 18 '25 11:07 Lunik

/cc @tuannvm @zvdy Also I wonder if we should bump up the openai deps to ensure we are using the latest libraries.

droot avatar Jul 18 '25 18:07 droot

> /cc @tuannvm @zvdy Also I wonder if we should bump up the openai deps to ensure we are using the latest libraries.

Probably yes. I planned on reviewing #415 and checking the openai changelog to see whether anything might break, which shouldn't be the case. But feel free to take it, @tuannvm; I'm quite busy at the moment.

zvdy avatar Jul 18 '25 19:07 zvdy

I upgraded the library and added a couple more unit tests for validation: https://github.com/GoogleCloudPlatform/kubectl-ai/pull/426.

Also, I ran a quick model test, and it seems we should have a recommended list of models to use.

| Model         | Test Result | Failure Reason                                                                                   |
|---------------|-------------|------------------------------------------------------------------------------------------------|
| gpt-4.1       | Success     | N/A                                                                                            |
| gpt-4.1-mini  | Success     | N/A                                                                                            |
| gpt-4.1-nano  | Success     | N/A                                                                                            |
| gpt-4o        | Success     | N/A                                                                                            |
| gpt-4o-mini   | Failure     | Command execution issue ("get namespaces" command failed)                                      |
| gpt-4-turbo   | Success     | N/A                                                                                            |
| o3            | Success     | N/A                                                                                            |
| o3-pro        | Failure     | 404 Not Found error: Model not supported on chat completions endpoint; should use v1/completions |
| o4-mini       | Success     | N/A                                                                                            |
| o3-mini       | Success     | N/A                                                                                            |
| codex-mini    | Failure     | 404 Not Found error: Model does not exist or no access                                         |
| o1            | Success     | N/A                                                                                            |
| o1-mini       | (No result) | No test result output provided                                                                 |

tuannvm avatar Jul 19 '25 07:07 tuannvm