bug: Sending more tokens than it should
Describe the bug
The prompt is sent with more than 12k tokens when I type just "say hi". The first image is the prompt and the second shows the logs in the Groq dashboard. The last two lines were tests I made using curl, and there everything worked properly. My config is below. I changed the token limits a few times, but the result was always the same.
To reproduce
```lua
{
  "yetone/avante.nvim",
  event = "VeryLazy",
  version = false, -- Never set this value to "*"! Never!
  opts = {
    provider = "groq",
    cursor_applying_provider = "groq",
    behaviour = {
      enable_cursor_planning_mode = true, -- enable cursor planning mode!
    },
    providers = {
      groq = { -- define groq provider
        __inherited_from = 'openai',
        api_key_name = 'GROQ_API_KEY',
        endpoint = 'https://api.groq.com/openai/v1/',
        model = 'llama-3.3-70b-versatile',
        timeout = 30000,
        extra_request_body = {
          max_completion_tokens = 50000, -- remember to increase this value, otherwise it will stop generating halfway
          max_tokens = 12000,
        },
      },
    },
  },
  dependencies = {
    "nvim-lua/plenary.nvim",
    "MunifTanjim/nui.nvim",
    -- The below dependencies are optional,
    "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
    "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
    "stevearc/dressing.nvim", -- for input provider dressing
    "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
    {
      'MeanderingProgrammer/render-markdown.nvim',
      opts = {
        file_types = { "markdown", "Avante" },
      },
      ft = { "markdown", "Avante" },
    },
  },
}
```
Expected behavior
No response
Installation method
Use lazy.nvim:
The build function from the README example didn't work for me, so I went into the directory .local/share/nvim/avante.nvim and ran make.
Environment
nvim: v0.11.2 Distribution: Arch
Repro
```lua
vim.env.LAZY_STDPATH = ".repro"
load(vim.fn.system("curl -s https://raw.githubusercontent.com/folke/lazy.nvim/main/bootstrap.lua"))()
require("lazy.minit").repro({
  spec = {
    -- add any other plugins here
    {
      "yetone/avante.nvim",
      event = "VeryLazy",
      version = false, -- Never set this value to "*"! Never!
      opts = {
        provider = "groq",
        cursor_applying_provider = "groq",
        behaviour = {
          enable_cursor_planning_mode = true, -- enable cursor planning mode!
        },
        providers = {
          groq = { -- define groq provider
            __inherited_from = 'openai',
            api_key_name = 'GROQ_API_KEY',
            endpoint = 'https://api.groq.com/openai/v1/',
            model = 'llama-3.3-70b-versatile',
            timeout = 30000,
            extra_request_body = {
              max_completion_tokens = 50000, -- remember to increase this value, otherwise it will stop generating halfway
              max_tokens = 12000,
            },
          },
        },
      },
      dependencies = {
        "nvim-lua/plenary.nvim",
        "MunifTanjim/nui.nvim",
        -- The below dependencies are optional,
        "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
        "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
        "stevearc/dressing.nvim", -- for input provider dressing
        "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
        {
          'MeanderingProgrammer/render-markdown.nvim',
          opts = {
            file_types = { "markdown", "Avante" },
          },
          ft = { "markdown", "Avante" },
        },
      },
    },
  },
})
```
I can confirm it took 15k tokens for "hi". For two more complex prompts it took more than 1M. I use claude/claude-sonnet-4-20250514.
Spent about $5 on Claude Sonnet after 30 minutes of simple use; I assume it's tied to this.
Same here.
same
#2436
same
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This is still happening, 10k tokens for "hi".
lol i just wasted 12k tokens on a "hi"
Any news?
This still happens, even with a private DeepSeek API. I used it at most 4 times, but it made 137 API calls, with more than 1 million tokens on some requests. The requests only involved one or two medium-sized HTML and JavaScript files.
Still happens
Any clue on this? I prefer avante over CodeCompanion because it retains chat history, but the token consumption is crazy: it just used 22k tokens on devstral through OpenRouter, 11k of thinking that resulted in "Ok, I understand" and 11k for a normal response to "hi!".
I realized it was sending a lot of tokens when sending "hello world" to a local llama.cpp instance failed because it "exceeded the context". From my quick debugging sessions, it's the descriptions of tools in avante that add tons of tokens. Either disable tools, or override the default prompts with override_prompt_dir.
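For anyone hitting the same thing, here is a minimal sketch of both workarounds, based on the Groq provider block earlier in this thread. This assumes `disable_tools` is accepted per-provider and `override_prompt_dir` is a top-level option, as described in the avante.nvim README; verify both against the version you have installed, and the prompt directory path is just a placeholder:

```lua
-- Sketch only: option names taken from the avante.nvim README; the
-- prompt directory path is a hypothetical example.
opts = {
  provider = "groq",
  providers = {
    groq = {
      __inherited_from = 'openai',
      api_key_name = 'GROQ_API_KEY',
      endpoint = 'https://api.groq.com/openai/v1/',
      model = 'llama-3.3-70b-versatile',
      disable_tools = true, -- skip sending tool descriptions with every request
    },
  },
  -- Alternatively, point avante at a directory of slimmer prompt templates:
  -- override_prompt_dir = vim.fn.expand("~/.config/nvim/avante_prompts"),
}
```

With tools disabled you lose the agentic features, but the per-request overhead should drop to roughly the system prompt plus your message.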