
bug: Sending more tokens than it should

Open WesleyMPG opened this issue 5 months ago • 13 comments

Describe the bug

The prompt is sent with more than 12k tokens when I type just "say hi". The first image is the prompt, and the second shows the logs in the Groq dashboard. The last two lines there were tests I made with curl, and everything worked properly. I'm including my config. I changed the token limits a few times, but the result was always the same.

(Screenshots: the prompt, and the request logs in the Groq dashboard.)

To reproduce

```lua
{
  "yetone/avante.nvim",
  event = "VeryLazy",
  version = false, -- Never set this value to "*"! Never!
  opts = {
    provider = "groq",
    cursor_applying_provider = "groq",
    behaviour = {
      enable_cursor_planning_mode = true, -- enable cursor planning mode!
    },
    providers = {
      groq = { -- define groq provider
        __inherited_from = 'openai',
        api_key_name = 'GROQ_API_KEY',
        endpoint = 'https://api.groq.com/openai/v1/',
        model = 'llama-3.3-70b-versatile',
        timeout = 30000,
        extra_request_body = {
          max_completion_tokens = 50000, -- remember to increase this value, otherwise it will stop generating halfway
          max_tokens = 12000,
        },
      },
    },
  },
  dependencies = {
    "nvim-lua/plenary.nvim",
    "MunifTanjim/nui.nvim",
    -- The dependencies below are optional:
    "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
    "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
    "stevearc/dressing.nvim", -- for input provider dressing
    "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
    {
      'MeanderingProgrammer/render-markdown.nvim',
      opts = {
        file_types = { "markdown", "Avante" },
      },
      ft = { "markdown", "Avante" },
    },
  },
}
```

Expected behavior

No response

Installation method

Use lazy.nvim:

The build function from the README example didn't work for me, so I went into the `.local/share/nvim/avante.nvim` directory and ran `make`.

Environment

nvim: v0.11.2 Distribution: Arch

Repro

```lua
vim.env.LAZY_STDPATH = ".repro"
load(vim.fn.system("curl -s https://raw.githubusercontent.com/folke/lazy.nvim/main/bootstrap.lua"))()

require("lazy.minit").repro({
  spec = {
    -- add any other plugins here
    {
      "yetone/avante.nvim",
      event = "VeryLazy",
      version = false, -- Never set this value to "*"! Never!
      opts = {
        provider = "groq",
        cursor_applying_provider = "groq",
        behaviour = {
          enable_cursor_planning_mode = true, -- enable cursor planning mode!
        },
        providers = {
          groq = { -- define groq provider
            __inherited_from = 'openai',
            api_key_name = 'GROQ_API_KEY',
            endpoint = 'https://api.groq.com/openai/v1/',
            model = 'llama-3.3-70b-versatile',
            timeout = 30000,
            extra_request_body = {
              max_completion_tokens = 50000, -- remember to increase this value, otherwise it will stop generating halfway
              max_tokens = 12000,
            },
          },
        },
      },
      dependencies = {
        "nvim-lua/plenary.nvim",
        "MunifTanjim/nui.nvim",
        -- The dependencies below are optional:
        "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
        "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
        "stevearc/dressing.nvim", -- for input provider dressing
        "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
        {
          'MeanderingProgrammer/render-markdown.nvim',
          opts = {
            file_types = { "markdown", "Avante" },
          },
          ft = { "markdown", "Avante" },
        },
      },
    },
  },
})
```

WesleyMPG avatar Jul 08 '25 03:07 WesleyMPG

I can confirm. It took 15k tokens for "hi". For two more complex prompts it took more than 1M. I use claude/claude-sonnet-4-20250514.

Antilamer avatar Jul 10 '25 12:07 Antilamer

I spent about $5 on Claude Sonnet after 30 minutes of simple use; I assume it's tied to this.

andrhlt avatar Jul 12 '25 03:07 andrhlt

Same here.

RHansenSmith avatar Jul 12 '25 11:07 RHansenSmith

same

chojs23 avatar Jul 13 '25 07:07 chojs23

#2436

addadi avatar Jul 13 '25 19:07 addadi

same

saidbenmoumen avatar Jul 14 '25 12:07 saidbenmoumen

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Aug 14 '25 02:08 github-actions[bot]

This is still happening, 10k tokens for "hi".

bcampolo avatar Aug 18 '25 13:08 bcampolo

lol i just wasted 12k tokens on a "hi"

andrwui avatar Aug 31 '25 21:08 andrwui


Any news?

WesleyMPG avatar Oct 01 '25 02:10 WesleyMPG


(Screenshot: API usage dashboard.) This still happens, even with the DeepSeek private API. I used it at most 4 times, but there are 137 API calls and more than 1 million tokens on some requests. My request involved only one or two medium HTML and JavaScript files.

AinRuizDorado avatar Nov 06 '25 00:11 AinRuizDorado


Still happens

bcampolo avatar Dec 07 '25 18:12 bcampolo

Any clue on this? I prefer avante over CodeCompanion because it retains chat history, but the token consumption is crazy: I just used 22k tokens on Devstral through OpenRouter. 11k of thinking, which resulted in "Ok, I understand", and 11k for a normal response to a "hi!".

(Screenshots: OpenRouter token usage.)

andrwui avatar Dec 17 '25 15:12 andrwui

I realized it was sending a lot of tokens when sending "hello world" to a local llama.cpp instance failed because it "exceeded the context". From my quick debugging sessions, it's the tool descriptions in avante that add tons of tokens. Either disable tools, or override the default prompts with `override_prompt_dir`.
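For anyone wanting to try the first workaround: here is a minimal sketch applied to the groq provider config from the original report. `disable_tools` appears in avante.nvim's provider examples, but check the README for your version before relying on it:

```lua
providers = {
  groq = {
    __inherited_from = 'openai',
    api_key_name = 'GROQ_API_KEY',
    endpoint = 'https://api.groq.com/openai/v1/',
    model = 'llama-3.3-70b-versatile',
    disable_tools = true, -- stop sending tool descriptions with every request
  },
},
```

If the tool descriptions really are the bulk of the payload, this should shrink the per-request token count noticeably; comparing the provider dashboard before and after is an easy way to verify.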

teto avatar Dec 17 '25 16:12 teto