Better handling of "kubectl logs"
kubectl-ai --llm-provider azopenai --model gpt-4.1 --quiet <my-query>
Running: kubectl get pod -n <ns> -l app=<app> -o yaml
Running: kubectl logs <pod> -n <ns> --previous
Error: POST <azopenai endpoint>
--------------------------------------------------------------------------------
RESPONSE 400: 400 Bad Request
ERROR CODE: context_length_exceeded
--------------------------------------------------------------------------------
{
"error": {
"message": "This model's maximum context length is 1047576 tokens. However, your messages resulted in 1901308 tokens (1900600 in the messages, 708 in the functions). Please reduce the length of the messages or functions.",
"type": "invalid_request_error",
"param": "messages",
"code": "context_length_exceeded"
}
}
It's not surprising that this can happen, but I'm wondering whether it's worth tackling so that at least we don't have things crashing this way.
There is more than one way to tackle this, both in code and in the system prompt. Instructing the model to always run `kubectl logs --tail <some reasonable number>` could help as well. Do folks have more ideas or directional thoughts on how we should approach this?
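For concreteness, this is the kind of bounded invocation I have in mind (the tail value is only illustrative):

```sh
# Unbounded: the previous container's full log can easily exceed the model's context window
kubectl logs <pod> -n <ns> --previous

# Bounded: same call, but capped to the most recent lines
kubectl logs <pod> -n <ns> --previous --tail=500
```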
cc @droot @hakman
For now, I would suggest limiting the amount of log data via `kubectl logs` args (example after the list):
- --since - Only return logs newer than a relative duration like 30m
- --limit-bytes - Maximum bytes of logs to return. Defaults to no limit
- --container - Print the logs of this specific container in the pod
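Roughly something like this (values are illustrative; the right bounds depend on the workload):

```sh
kubectl logs <pod> -n <ns> --container=<container> --since=30m --limit-bytes=500000
```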
Adding a filter for the `kubectl logs` output would also be something cool, as in most cases one will be looking for errors and warnings.
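Even a plain shell filter would already help a lot, e.g. (the pattern is just an example):

```sh
kubectl logs <pod> -n <ns> --tail=5000 | grep -E -i 'warn|error|fatal'
```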
I like any `kubectl logs` args that cap the amount of logs in absolute terms: `--limit-bytes` or `--tail`. I'm not sure about `--since`, though, as it again depends on how verbose the app is, and even 30m of logs can be enough to exceed the context. Any of this would be added via instructions (prompt) only, right?
slightly related: https://github.com/GoogleCloudPlatform/kubectl-ai/issues/249
I think this is expected behavior, and trying to mutate the default output of `kubectl logs` could lead to confusion. Instead, prompting the LLM to use an MCP server such as the filesystem server to store the logs and then troubleshoot them in smaller batches might be a more effective approach (rough sketch below).
slightly related: #249
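A rough sketch of the flow I mean, assuming a filesystem-style tool is available to the agent (paths and sizes are just examples):

```sh
# Dump the full logs once to a file instead of returning them to the model
kubectl logs <pod> -n <ns> --previous > /tmp/<pod>-previous.log

# Then let the agent inspect the file in bounded pieces
wc -c /tmp/<pod>-previous.log                                # how big is it?
tail -c 100000 /tmp/<pod>-previous.log                       # last ~100 KB
grep -E -i 'warn|error|fatal' /tmp/<pod>-previous.log | tail -n 200
```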
Agreed with the thoughts in the linked issue. Would it still be worth having a band-aid change to prevent this error scenario? `kubectl logs` is run frequently by the agent, and without `--tail`, `--limit-bytes`, or some filter on the command, we are bound to hit this error with production workloads more often than not.
> I think this is expected behavior, and trying to mutate the default output of `kubectl logs` could lead to confusion.
Sure, mutating the default output of `kubectl logs` could lead to confusion, but if the agent runs the command itself with appropriate flags to limit the logs, that will be obvious to users, right? Besides, we (humans) would also be more likely to run `kubectl logs --tail` on a production workload than a plain `kubectl logs`.
I also think we need this AND some MCP server like you suggested (or whatever comes out of researching what state-of-the-art agents employ).
Mutating the output can be confusing, and mutating the original command (e.g., adding --tail if it's missing) is even worse. Instead, we could consider an approach like the following (rough shell illustration after this comment):
- Log level filtering
- Log aggregation with one representative sample per template
This would help reduce both the cost and the overhead of excessive back-and-forth with the LLM, especially when dealing with large volumes of logs.
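A crude shell approximation of the template-aggregation idea, just to illustrate (the real implementation would presumably live in kubectl-ai itself):

```sh
# Normalize volatile fields (here: digit runs), then keep one representative
# line per "template" together with a count of how often it occurred
kubectl logs <pod> -n <ns> --tail=10000 \
  | sed -E 's/[0-9]+/N/g' \
  | sort | uniq -c | sort -rn | head -n 50
```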