
bug: Sending more tokens than it should

Open WesleyMPG opened this issue 5 months ago • 13 comments

Describe the bug

The prompt is sent with more than 12k tokens when I type just "say hi". The first image is the prompt, and the second shows the logs in the Groq dashboard. The last two lines there were tests I made with curl, and everything worked properly. I'm including my config. I changed the token limits a few times, but the result was always the same.

(Screenshots: the prompt, and the request logs in the Groq dashboard.)

To reproduce

```lua
{
  "yetone/avante.nvim",
  event = "VeryLazy",
  version = false, -- Never set this value to "*"! Never!
  opts = {
    provider = "groq",
    cursor_applying_provider = "groq",
    behaviour = {
      enable_cursor_planning_mode = true, -- enable cursor planning mode!
    },
    providers = {
      groq = { -- define groq provider
        __inherited_from = 'openai',
        api_key_name = 'GROQ_API_KEY',
        endpoint = 'https://api.groq.com/openai/v1/',
        model = 'llama-3.3-70b-versatile',
        timeout = 30000,
        extra_request_body = {
          max_completion_tokens = 50000, -- remember to increase this value, otherwise it will stop generating halfway
          max_tokens = 12000,
        },
      },
    },
  },
  dependencies = {
    "nvim-lua/plenary.nvim",
    "MunifTanjim/nui.nvim",
    -- The dependencies below are optional:
    "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
    "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
    "stevearc/dressing.nvim", -- for input provider dressing
    "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
    {
      'MeanderingProgrammer/render-markdown.nvim',
      opts = {
        file_types = { "markdown", "Avante" },
      },
      ft = { "markdown", "Avante" },
    },
  },
}
```

Expected behavior

No response

Installation method

Use lazy.nvim:

The build function from the README example didn't work for me, so I went into the `.local/share/nvim/avante.nvim` directory and ran `make`.

Environment

nvim: v0.11.2 Distribution: Arch

Repro

```lua
vim.env.LAZY_STDPATH = ".repro"
load(vim.fn.system("curl -s https://raw.githubusercontent.com/folke/lazy.nvim/main/bootstrap.lua"))()

require("lazy.minit").repro({
  spec = {
    -- add any other plugins here
    {
      "yetone/avante.nvim",
      event = "VeryLazy",
      version = false, -- Never set this value to "*"! Never!
      opts = {
        provider = "groq",
        cursor_applying_provider = "groq",
        behaviour = {
          enable_cursor_planning_mode = true, -- enable cursor planning mode!
        },
        providers = {
          groq = { -- define groq provider
            __inherited_from = 'openai',
            api_key_name = 'GROQ_API_KEY',
            endpoint = 'https://api.groq.com/openai/v1/',
            model = 'llama-3.3-70b-versatile',
            timeout = 30000,
            extra_request_body = {
              max_completion_tokens = 50000, -- remember to increase this value, otherwise it will stop generating halfway
              max_tokens = 12000,
            },
          },
        },
      },
      dependencies = {
        "nvim-lua/plenary.nvim",
        "MunifTanjim/nui.nvim",
        -- The dependencies below are optional:
        "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
        "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
        "stevearc/dressing.nvim", -- for input provider dressing
        "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
        {
          'MeanderingProgrammer/render-markdown.nvim',
          opts = {
            file_types = { "markdown", "Avante" },
          },
          ft = { "markdown", "Avante" },
        },
      },
    },
  },
})
```

WesleyMPG avatar Jul 08 '25 03:07 WesleyMPG

I can confirm. It took 15k tokens for "hi". For two more complex prompts it took more than 1M. I use claude/claude-sonnet-4-20250514.

Antilamer avatar Jul 10 '25 12:07 Antilamer

I spent about $5 on Claude Sonnet after 30 minutes of simple use; I assume it's tied to this.

andrhlt avatar Jul 12 '25 03:07 andrhlt

Same here.

RHansenSmith avatar Jul 12 '25 11:07 RHansenSmith

same

chojs23 avatar Jul 13 '25 07:07 chojs23

#2436

addadi avatar Jul 13 '25 19:07 addadi

same

saidbenmoumen avatar Jul 14 '25 12:07 saidbenmoumen

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Aug 14 '25 02:08 github-actions[bot]

This is still happening, 10k tokens for "hi".

bcampolo avatar Aug 18 '25 13:08 bcampolo

lol i just wasted 12k tokens on a "hi"

andrwui avatar Aug 31 '25 21:08 andrwui


Any news?

WesleyMPG avatar Oct 01 '25 02:10 WesleyMPG


(Screenshot: API usage dashboard.) This still happens, even with the DeepSeek private API. I used it at most 4 times, but there are 137 API calls and more than 1 million tokens on some requests. My request involved only one or two medium HTML and JavaScript files.

AinRuizDorado avatar Nov 06 '25 00:11 AinRuizDorado


Still happens

bcampolo avatar Dec 07 '25 18:12 bcampolo

Any clue on this? I prefer avante over CodeCompanion because it retains chat history, but the token consumption is crazy: I just used 22k tokens on Devstral through OpenRouter. 11k of thinking, which resulted in "Ok, I understand", and 11k for a normal response to a "hi!".

(Screenshots: OpenRouter token usage.)

andrwui avatar Dec 17 '25 15:12 andrwui

I realized it was sending a lot of tokens when sending "hello world" to a local llama.cpp instance failed because it "exceeded the context". From my quick debugging sessions, it's the tool descriptions in avante that add tons of tokens. Either disable tools, or override the default prompts with `override_prompt_dir`.
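For anyone wanting to try the first workaround: here is a minimal sketch applied to the groq provider config from the original report. `disable_tools` appears in avante.nvim's provider examples, but check the README for your version before relying on it:

```lua
providers = {
  groq = {
    __inherited_from = 'openai',
    api_key_name = 'GROQ_API_KEY',
    endpoint = 'https://api.groq.com/openai/v1/',
    model = 'llama-3.3-70b-versatile',
    disable_tools = true, -- stop sending tool descriptions with every request
  },
},
```

If the tool descriptions really are the bulk of the payload, this should shrink the per-request token count noticeably; comparing the provider dashboard before and after is an easy way to verify.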

teto avatar Dec 17 '25 16:12 teto