VS Code LM API uses too many requests
Description
When using the VS Code LM API provider, tool/assistant calls are not being identified correctly and count against the request quota.
To reproduce, make any prompt that initiates a tool call (e.g. "summarize the contents of this folder") and observe the excessive request usage.
Instead, the provider should set the "X-Initiator" header to "agent" for those calls.
See the similar issue in opencode: https://github.com/sst/opencode/issues/430 and the fix: https://github.com/sst/opencode/pull/595
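Something along these lines, modeled on the linked opencode patch, might work in Kilo's provider. The message shape, endpoint plumbing, and the tool-role heuristic below are my own assumptions for illustration; only the `X-Initiator: agent` header value comes from the linked fix:

```typescript
// Hedged sketch: mark automated follow-up requests (tool-call continuations)
// so the backend can distinguish them from user-initiated turns.
// Only the "X-Initiator" header value is taken from the opencode fix;
// everything else (types, endpoint, heuristic) is illustrative.

type ChatMessage = { role: "user" | "assistant" | "tool"; content: string };

async function sendChatRequest(
  endpoint: string,
  apiKey: string,
  messages: ChatMessage[]
): Promise<Response> {
  // Heuristic (assumption): if the history already contains a tool result,
  // this request is an agent-initiated continuation, not a fresh user prompt.
  const isAgentTurn = messages.some((m) => m.role === "tool");

  return fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
      // "agent" marks automated follow-ups; "user" marks the initial prompt.
      "X-Initiator": isAgentTurn ? "agent" : "user",
    },
    body: JSON.stringify({ messages }),
  });
}
```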
Thank you for the report! Sadly, the VS Code LM API is in a "very experimental" state, so I'm not surprised at all.
Which provider/model are you using?
The model doesn't seem to matter. I've tried GPT 4.1, o3, and Gemini Pro.
I'm not seeing the same behavior. I spent much of today and yesterday using the LM API-based mechanism with Kilo, and I'm not seeing anything recorded as premium request usage when hitting the included models.
Forget what I said about the included model GPT 4.1. You're probably right that it doesn't move the needle; this is kind of an inexact process.
I mean, I submitted literally hundreds of requests yesterday via the LM API to 4.1 and 4o models. I would have blown through my premium limit.
GPT-4.1 and GPT-4o are not counted as premium requests by GitHub Copilot. Models like Claude, Gemini, or GPT-5 will reduce the premium quota.
I used the Copilot premium model Claude 4 to ask about the purpose of a simple 26-line tsconfig.json file, and Kilo used two premium requests when it could have used only one. I believe one was used for the AI response and the second for task completion. Task completion could be handled by other cost-efficient models that aggregate the results of a task. Perhaps in Kilo's settings, users could set a particular model of their choice for task completion?
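For illustration, a rough sketch of how such a setting might be read inside the extension. The `kilo.taskCompletionModel` key, the configuration section name, and the model IDs are all hypothetical, not Kilo's actual configuration:

```typescript
// Hypothetical sketch: route "task completion" summaries to a cheaper model
// while the main response keeps using the premium one.
import * as vscode from "vscode";

function pickModelId(purpose: "response" | "taskCompletion"): string {
  // "kilo" as a settings section is an assumption for this sketch.
  const config = vscode.workspace.getConfiguration("kilo");
  if (purpose === "taskCompletion") {
    // e.g. "kilo.taskCompletionModel": "gpt-4.1" (not billed as premium)
    return config.get<string>("taskCompletionModel", "gpt-4.1");
  }
  return config.get<string>("model", "claude-sonnet-4");
}
```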