VS Code LM API uses too many requests

Open jchadwick opened this issue 5 months ago • 8 comments

Description

When using the VS Code LM API provider, tool and assistant calls are not identified correctly, so each of them counts against the request quota.

To reproduce, send any prompt that triggers a tool call (e.g. "summarize the contents of this folder") and observe the excessive request usage.

Instead, the provider should set the "X-Initiator" header to "agent" for these calls.

See the similar issue in opencode: https://github.com/sst/opencode/issues/430
And the fix: https://github.com/sst/opencode/pull/595

jchadwick avatar Jul 03 '25 17:07 jchadwick
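
To make the request above concrete, here is a minimal sketch of how a client that builds its own HTTP requests might choose the "X-Initiator" value. Everything in it is an assumption for illustration: the `ChatMessage` shape, the `resolveInitiator` and `withInitiatorHeader` helpers, and the rule itself (only a request whose newest message comes from the user counts as user-initiated). It is not taken from Kilo Code, and this thread does not establish whether the VS Code LM API lets an extension set request headers at all.

```typescript
// Hypothetical message shape; not taken from Kilo Code or the VS Code LM API.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

type Initiator = "user" | "agent";

// Assumed rule: a request is user-initiated only when its newest message
// was written by the user. Tool results and assistant continuations are
// treated as follow-ups within the same turn. An empty history falls
// through to "agent".
function resolveInitiator(messages: ChatMessage[]): Initiator {
  const last = messages[messages.length - 1];
  return last !== undefined && last.role === "user" ? "user" : "agent";
}

// Return a copy of the outgoing headers with "X-Initiator" added.
function withInitiatorHeader(
  headers: Record<string, string>,
  messages: ChatMessage[],
): Record<string, string> {
  return { ...headers, "X-Initiator": resolveInitiator(messages) };
}
```

Under these assumptions, a request whose history ends with a tool result would carry "X-Initiator: agent", while the prompt that started the turn would carry "X-Initiator: user".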

Thank you for the report! Sadly, the VS Code LM API is in a "very experimental" state, so I'm not surprised at all.

Which provider/model are you using?

HadesArchitect avatar Jul 04 '25 12:07 HadesArchitect

The model doesn't seem to matter. I've tried GPT 4.1, o3, and Gemini Pro

jchadwick avatar Jul 04 '25 14:07 jchadwick

I'm not seeing the same behavior. I spent much of today and yesterday using the LM API-based mechanism with Kilo, and nothing has been recorded as premium request usage when hitting the included models.

mcowger avatar Jul 08 '25 03:07 mcowger

Forget I said the included model GPT 4.1. You're probably right that that doesn't move the needle - this is kind of an inexact process.

jchadwick avatar Jul 08 '25 15:07 jchadwick

I mean, I submitted literally hundreds of requests yesterday via the LM API to the 4.1 and 4o models. If those counted, I would have blown out my premium limit.

mcowger avatar Jul 09 '25 00:07 mcowger

4.1 and 4o are not counted as premium by GitHub Copilot. Models like Claude, Gemini, or GPT-5 will reduce the premium quota.

cuipengfei avatar Aug 18 '25 14:08 cuipengfei

I used the Copilot premium model Claude 4 to ask about the purpose of a simple 26-line tsconfig.json file, and Kilo used two premium requests when it could have used only one. I believe one was used for the AI response and the second for task completion. I think task completion could be handled by other, more cost-efficient models that can aggregate the results of a task. Perhaps Kilo's settings could let users choose a particular model for task completion?

deyil avatar Aug 28 '25 12:08 deyil
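
The suggestion above could look roughly like the following sketch. It is purely illustrative: the `ModelRoutingSettings` shape, the `completionModel` option, the `RequestPhase` names, and the `pickModel` helper are all hypothetical and do not describe any existing Kilo setting, and the model ids are placeholders.

```typescript
// Hypothetical settings shape; Kilo does not necessarily expose anything like this.
interface ModelRoutingSettings {
  // Model used for the main AI response.
  primaryModel: string;
  // Optional, cheaper model for the final task-completion step.
  completionModel?: string;
}

// Assumed split of a task into the two requests described in the comment above.
type RequestPhase = "response" | "completion";

// Use the cheaper model for task completion when one is configured,
// otherwise fall back to the primary model.
function pickModel(settings: ModelRoutingSettings, phase: RequestPhase): string {
  if (phase === "completion" && settings.completionModel !== undefined) {
    return settings.completionModel;
  }
  return settings.primaryModel;
}

// Placeholder model ids, for illustration only.
const exampleSettings: ModelRoutingSettings = {
  primaryModel: "claude-4",
  completionModel: "gpt-4.1",
};
```

With `exampleSettings`, the response phase would go to the primary model and the completion phase to the configured cheaper one. Leaving `completionModel` unset keeps today's behavior of using one model for both.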