
Tokens leak?

Open ErykCh opened this issue 1 year ago • 10 comments

Issue

While checking the token consumption of the application I am writing (it uses an LLM), I noticed something strange: over 11,000 tokens for a small change assigned to Aider. So I called /tokens, and it showed 1792 tokens. As a test, I told Aider to make a simple change. Yet Aider sent 11301 tokens?

image

Version and model info

Aider v0.43.5-dev
Models: openrouter/anthropic/claude-3.5-sonnet with diff edit format, weak model openrouter/anthropic/claude-3-haiku
Git repo: .git with 46 files
Repo-map: using 2028 tokens
Added .aider.memory.md to the chat.

ErykCh avatar Jul 14 '24 22:07 ErykCh

Thanks for trying aider and filing this issue. It looks like you have identified a bug. The /tokens command is not showing the tokens used by the repo-map. I'll get this fixed.

paul-gauthier avatar Jul 17 '24 13:07 paul-gauthier

I don't think that's the only problem. Right after startup, the repo-map did sometimes have more tokens than the --map-tokens setting, but the second query was usually below --map-tokens.

ErykCh avatar Jul 18 '24 12:07 ErykCh

The --map-tokens is a guideline, not a hard limit.

Sorry, I am unable to reproduce this. Are you able to make this happen again? If so, can you share the announce lines that aider prints on launch as well as the /tokens output from that same session?

Are you sure that the screenshot you shared of /tokens is from the same session as usage lines you highlighted in the openrouter screenshot?

Because the openrouter usage lines look like they were from a session where you used aider without adding any files to the chat and while operating inside a git repo. And the /tokens screenshot looks like you have added a file and are running aider without an active repomap because maybe you are not inside a git repo.

paul-gauthier avatar Jul 18 '24 15:07 paul-gauthier

> Are you sure that the screenshot you shared of /tokens is from the same session as usage lines you highlighted in the openrouter screenshot?

Yes.

> Because the openrouter usage lines look like they were from a session where you used aider without adding any files to the chat and while operating inside a git repo.

It seems that I still have this session open.

So let's delve into it.

I started aider and ran the first task. After I saw how many tokens it used, I called /tokens.

image

So I repeated with another small change. image

I just ran another test in the same session. At the beginning, /tokens shows the repo-map. Then I removed a file, re-added the same file from this ticket, and made a small change: there is no repo-map, and token usage is high despite the small file. image

image

When I drop this file, it is on the list printed by /ls.

image

ErykCh avatar Jul 19 '24 05:07 ErykCh

I closed the session, updated to the latest version of Aider.

Aider v0.45.1
Models: openrouter/anthropic/claude-3.5-sonnet with diff edit format, weak model openrouter/openai/gpt-4o-mini
Git repo: .git with 46 files
Repo-map: using 2028 tokens
Added .aider.memory.md to the chat.

There is still no repo-map for this file.

image

ErykCh avatar Jul 19 '24 06:07 ErykCh

I've also run into a similar issue. I can clear chat history and start a new session, and the first prompt immediately says it's using a million tokens. From my .aider.chat.history.md:

#### Example_File.ipynb for how coding could be improved. For example, could the plotting function be summarized and called instead of writing the full code each time?  

To improve the coding in `Example_File.ipynb`, we can focus on refactoring the plotting function to avoid repeating the full code each time. The most likely file that needs to be edited is:

- `Research\Example_File.ipynb`

Please add `Research\Example_File.ipynb` to the chat so I can propose specific changes.

< I manually add the file in the browser GUI, this line is not part of the actual  `.aider.chat.history.md` >

#### Approved.

>  
>  
> Model deepseek/deepseek-coder has hit a token limit!  
> Token counts below are approximate.  
>  
> Input tokens: ~1,204,876 of 128,000 -- possibly exhausted context window!  
> Output tokens: ~0 of 4,096  
> Total tokens: ~1,204,876 of 128,000 -- possibly exhausted context window!  

I noticed that it wasn't automatically adding files like it was before.

I also noticed that neither .aider.chat.history.md nor .aider.input.history is actually empty after I cleared the chat history with the button in the browser. .aider.chat.history.md is quite big: 76 KB, about 2000 lines. Could this be relevant to the issue? Is it trying to send the entire .aider.chat.history.md in the prompt?

mr-september avatar Jul 26 '24 07:07 mr-september
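A rough sanity check on the question above, as a sketch only: it assumes about 4 characters per token, which is a common rule of thumb and not the tokenizer aider or the model actually uses. The file size and token counts are the ones quoted in the comment; the names `CHARS_PER_TOKEN` and `estimate_tokens` are made up for illustration.

```python
# Assumption: ~4 characters per token (rule of thumb, not a real tokenizer).
CHARS_PER_TOKEN = 4


def estimate_tokens(size_bytes: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Very rough token estimate for a plain-text file of the given size."""
    return size_bytes // chars_per_token


# Figures quoted in the comment above.
history_size_bytes = 76 * 1024        # .aider.chat.history.md, about 76 KB
reported_input_tokens = 1_204_876     # "Input tokens: ~1,204,876"
context_window = 128_000              # "of 128,000"

history_estimate = estimate_tokens(history_size_bytes)
over_limit = reported_input_tokens > context_window

print(f"estimated tokens in history file: ~{history_estimate:,}")
print(f"reported input tokens:            ~{reported_input_tokens:,}")
print(f"exceeds context window:           {over_limit}")
```

Under that assumption, the history file alone would come to far fewer tokens than the reported input, so it may not be the whole explanation. Running /tokens is the reliable way to see what is actually being sent.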

@mr-september you should run /tokens to see what's using that 1M tokens. Did you add those .aider... files to your repo and to the chat?

paul-gauthier avatar Jul 30 '24 18:07 paul-gauthier

@ErykCh any chance you can share that specific file with me? Wondering if it is somehow causing the repo map to crash in the tokens display.

paul-gauthier avatar Jul 30 '24 18:07 paul-gauthier

/tokens

Approximate context window usage, in tokens:

$ 0.0002    1,208 system messages
$ 0.0007    4,703 repository map  use --map-tokens to resize
==================
$ 0.0008    5,911 tokens total
          122,089 tokens remaining in context window
          128,000 tokens max context window size

Not sure if this helps, but I just updated to v0.46.1. Immediately cleared chat history, and my first message input already uses 7000+ tokens. Does this seem a bit high?

#### Check Example.ipynb for how coding could be improved. For example, could the plotting function be summarized and called instead of writing the full code each time?  

To improve the coding in `Example.ipynb`, the most likely file that needs changes is:

- `Research\Example.ipynb`

Please add this file to the chat so I can propose specific changes.

> Tokens: 7,376 sent, 51 received. Cost: $0.0010 request, $0.0010 session.  

mr-september avatar Jul 31 '24 09:07 mr-september
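The numbers in the /tokens output and the usage line above can be cross-checked with a few lines of arithmetic. This is only a sketch using figures copied from the comment; the helper `tally` is a hypothetical name, and the thread does not say what makes up the difference between the /tokens total and the tokens actually sent.

```python
# Figures copied from the /tokens output and the "Tokens: ... sent" line above.
parts = {
    "system messages": 1_208,
    "repository map": 4_703,
}
max_context_window = 128_000
tokens_sent = 7_376


def tally(parts: dict[str, int], window: int) -> tuple[int, int]:
    """Return (total tokens, tokens remaining in the context window)."""
    total = sum(parts.values())
    return total, window - total


total, remaining = tally(parts, max_context_window)
unaccounted = tokens_sent - total  # sent on the first request but not listed by /tokens

print(f"total:       {total:,}")
print(f"remaining:   {remaining:,}")
print(f"unaccounted: {unaccounted:,}")
```

The total and remaining values match the /tokens output. The gap between that total and the tokens sent is what the question is about.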

This is expected when you are running with no added files. Aider uses a larger repo map, so that the LLM can thoroughly understand your code base. Since you haven't added any files, the LLM is encouraged to suggest which files need to be edited based on your chat requests. To make good suggestions, it helps to have a larger repo map.

paul-gauthier avatar Aug 02 '24 09:08 paul-gauthier
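The behavior described above (a larger repo map when no files are in the chat, with --map-tokens acting as a guideline) can be pictured with a small sketch. This is not aider's code: the function name, the multiplier, and the padding value are all hypothetical and chosen only to illustrate the idea.

```python
def effective_map_budget(
    map_tokens: int,
    files_in_chat: int,
    context_window: int,
    no_files_multiplier: int = 8,   # hypothetical value, for illustration only
    padding: int = 4_096,           # hypothetical headroom left for the reply
) -> int:
    """Illustrative repo-map token budget.

    With files in the chat, the budget is simply map_tokens.
    With no files, the budget is scaled up so the model sees more of the
    code base, but is capped so it still fits in the context window.
    """
    if files_in_chat > 0:
        return map_tokens
    return min(map_tokens * no_files_multiplier, context_window - padding)


# Same --map-tokens setting, different chat state.
with_files = effective_map_budget(1_024, files_in_chat=1, context_window=128_000)
without_files = effective_map_budget(1_024, files_in_chat=0, context_window=128_000)
small_window = effective_map_budget(1_024, files_in_chat=0, context_window=8_000)

print(with_files, without_files, small_window)
```

The point of the sketch is only that the same setting can produce very different request sizes depending on whether any files have been added.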

I'm going to close this issue for now, but feel free to add a comment here and I will re-open or file a new issue any time.

paul-gauthier avatar Aug 06 '24 13:08 paul-gauthier