
[BUG]: Abnormal token consumption (3M+ tokens in 2 hours) with no UI feedback or summaries displayed

Open · hungryDodo opened this issue 2 months ago · 3 comments

🐛 Bug description [Please help everyone understand it]

I'm experiencing extremely high token consumption that far exceeds the documented usage patterns. According to the project's issue discussion https://github.com/volcengine/MineContext/issues/44#issuecomment-3393976330, normal daily consumption should be around 2-3 million tokens. However, after about 2 hours of usage, I've consumed over 3 million tokens, which seems abnormal. Additionally, the software interface shows no summaries or processing results, which suggests the tokens are being consumed without producing visible output.

This indicates either:

  1. A backend processing loop causing redundant API calls
  2. Configuration issues leading to excessive LLM requests
  3. A bug in the token usage monitoring/reporting system

The image below shows a screenshot of my token usage from the VolcEngine backend. The data hasn't appeared there yet, likely due to a reporting delay. However, I set a display limit of 3,000,000 tokens for the model's call volume, and the automatic pause has already been triggered, so usage must have reached that limit. I will check the console again tomorrow to see whether the data has been updated.

Image

🧑‍💻 Steps to reproduce

The software was installed and configured strictly in accordance with the steps outlined in the README file:

  1. Install and configure MineContext following the quick start guide
  2. Set up API keys for LLM providers
  3. Run MineContext normally for approximately 2 hours
  4. Check token consumption via provider dashboard
  5. Observe that UI shows no summaries or processing results despite high token usage

👾 Expected result

I expected that:

  • Token consumption for 2 hours should be proportional to daily usage (roughly 500K-600K tokens for 2 hours, assuming a 2-3M daily total spread over a working day)
  • The UI should display summaries, insights, or processing results corresponding to the token consumption
  • Background processing should not consume tokens without producing user-visible output

🚑 Any additional information

  • Token consumption rate appears to be ~1.5M tokens per hour, which would result in ~36M tokens per day (far exceeding documented usage)
  • No error messages displayed in the UI
  • Application appears to be running normally from user perspective
  • Need to investigate monitoring logs and token usage tracking
  • Potential investigation areas: processing loops in the context_processing pipeline, LLM client retry mechanisms, or the capture manager triggering excessive API calls (see the logging sketch below)
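
In case it helps with the investigation, here is a rough sketch of the kind of per-call token logging I have in mind. This is not MineContext's actual code; it just assumes an OpenAI-compatible chat client (which most providers, including VolcEngine Ark, expose) and wraps the completion call so each request's token usage is logged with a timestamp. Something like this around the LLM client would make runaway loops or retries show up immediately in the logs:

```python
# Hypothetical instrumentation sketch, not MineContext internals.
# Assumes an OpenAI-compatible client object with chat.completions.create().
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("token_audit")

def logged_completion(client, **kwargs):
    """Wrap a chat-completion call and log tokens consumed plus latency."""
    start = time.monotonic()
    response = client.chat.completions.create(**kwargs)
    elapsed = time.monotonic() - start
    usage = getattr(response, "usage", None)
    log.info(
        "model=%s prompt_tokens=%s completion_tokens=%s total=%s elapsed=%.1fs",
        kwargs.get("model"),
        getattr(usage, "prompt_tokens", "?"),
        getattr(usage, "completion_tokens", "?"),
        getattr(usage, "total_tokens", "?"),
        elapsed,
    )
    return response
```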

🛠️ MineContext Version

0.1.1

💻 Platform Details

Operating System: macOS Tahoe 26.0.1

hungryDodo · Oct 12 '25, 15:10

Same for me, following the same steps. Operating System: macOS 15.1.1. MineContext: 0.1.1.

MicDZ · Oct 12 '25, 19:10

Okay, let me explain my calculation for the 3M tokens. With 2 screens, running from 10:00-12:00, 14:00-18:00, and 19:00-23:00, that's roughly 10 hours, and the consumption comes out to about that figure. When we initially designed it, we intended it for work scenarios, so about 8 hours a day is sufficient; there's no need to multiply by 24 hours.

Additionally, you should factor in the number of windows being captured simultaneously: if you have many windows open, consumption will increase accordingly. We recommend reducing the screenshot frequency from 5 seconds to 15 or 30 seconds.

Finally, if you can, open localhost:8000 to check the token usage. (The image below shows my token usage from yesterday, about 3M.) Image

KashiwaByte101 · Oct 12 '25, 23:10

Hi @KashiwaByte101 ,

Thanks so much for the quick and detailed reply! I really appreciate you taking the time to explain the token estimation logic. Your calculation method is super helpful and provided a great baseline for me to dig a bit deeper into what's happening.

Following your logic, I did a quick calculation for my setup:

  • With 2 screens and a 5s interval (3× more frequent than the 15s in your example), and a baseline of ~3M tokens over ~8 hours ≈ 375k tokens/hour,
  • the expected usage would be around 375k tokens/hour × 3 = 1.125M tokens/hour.
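
For clarity, here is that estimate as a tiny script. The 375k/hour baseline and the linear scaling with capture interval and screen count are my own assumptions derived from your numbers, not anything measured from MineContext itself:

```python
# Back-of-envelope estimate: ~3M tokens over an ~8-hour workday with 2 screens
# at a 15s capture interval, scaled linearly with frequency and screen count.
# All constants are assumptions taken from this thread.
BASELINE_TOKENS_PER_HOUR = 3_000_000 / 8   # ≈ 375k tokens/hour at 15s interval
BASELINE_INTERVAL_S = 15

def expected_tokens_per_hour(interval_s: float, screens: int = 2) -> float:
    """Scale the baseline linearly with capture frequency and screen count."""
    frequency_factor = BASELINE_INTERVAL_S / interval_s
    screen_factor = screens / 2  # the baseline already assumes 2 screens
    return BASELINE_TOKENS_PER_HOUR * frequency_factor * screen_factor

print(f"{expected_tokens_per_hour(5):,.0f} tokens/hour at a 5s interval")    # ~1,125,000
print(f"{expected_tokens_per_hour(15):,.0f} tokens/hour at a 15s interval")  # ~375,000
```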

I took another look at the detailed backend logs, and they seem to confirm that the consumption is indeed higher than expected. As you can see in the chart below, there was a peak usage of 2.59M tokens in a single hour (Oct 12, 22:00-23:00), which is more than double the adjusted estimate. This aligns with the bug I was seeing, where high consumption was happening without any corresponding UI output.

Image

On a related note, I found a new issue today (Oct 13th): the daily report wasn't generated as expected. The "Proactive Feed" still shows "No activity records found," which seems to be a symptom of the same underlying problem.

Image

Unfortunately, the localhost:8000/monitoring page only displays data for the last 24 hours, so the peak from yesterday is no longer visible there.
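
To work around the 24-hour retention, I'll archive snapshots of the monitoring page during the next test run with something like the sketch below. It only assumes the page answers a plain HTTP GET at localhost:8000/monitoring; the snapshot folder name is just a placeholder, and the raw response is saved as-is without being parsed:

```python
# Periodically fetch the monitoring page and keep raw snapshots so peak hours
# aren't lost once they age out of the 24-hour window.
import time
import urllib.request
from datetime import datetime
from pathlib import Path

SNAPSHOT_DIR = Path("minecontext_monitoring_snapshots")  # placeholder output folder
SNAPSHOT_DIR.mkdir(exist_ok=True)

while True:
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    try:
        with urllib.request.urlopen("http://localhost:8000/monitoring", timeout=10) as resp:
            (SNAPSHOT_DIR / f"monitoring_{stamp}.html").write_bytes(resp.read())
    except OSError as exc:
        print(f"{stamp}: fetch failed: {exc}")
    time.sleep(30 * 60)  # snapshot every 30 minutes
```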

  • I will run another test shortly to try and reproduce the high-consumption state and will share the updated monitoring stats once I have them.
Image

Just a quick question while I'm debugging: is there a way to view the raw screenshots that the tool captures? I noticed the localhost:8000/vector_search endpoint and was curious if the images are converted directly into embeddings and then discarded, or if they are stored somewhere temporarily. Being able to see them might help in tracking down the issue.

Thanks again for your help and for building this awesome tool! Let me know if there are any specific logs or tests you'd like me to run.

hungryDodo · Oct 13 '25, 10:10