
autocompaction

Open lee101 opened this issue 5 months ago • 8 comments

noticed this kind of thing in https://github.com/sst/opencode/issues/1092 where prompts quickly become too long. With this change, the session auto-compacts if it's over roughly 80k tokens.
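The idea can be sketched as a rough character-based token estimate with a fixed compaction threshold. This is a hypothetical standalone version (the real function in the patch, `estimateTokensFromMessages`, walks typed message parts); the ~4-chars-per-token ratio is a common heuristic, not an exact tokenizer:

```typescript
// Rough token estimate: ~4 characters per token is a common heuristic.
// Hypothetical simplification of estimateTokensFromMessages.
function estimateTokens(texts: string[]): number {
  const totalChars = texts.reduce((sum, t) => sum + t.length, 0);
  return Math.ceil(totalChars / 4);
}

const AUTO_COMPACT_TOKEN_THRESHOLD = 80_000;

// Compact once the estimate crosses the fixed threshold.
function shouldCompact(texts: string[]): boolean {
  return estimateTokens(texts) > AUTO_COMPACT_TOKEN_THRESHOLD;
}
```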

lee101 avatar Jul 30 '25 03:07 lee101

also fixes this one https://github.com/sst/opencode/issues/970

lee101 avatar Jul 30 '25 08:07 lee101

ah no, I still get this: `AI_APICallError: prompt token count of 151316 exceeds the limit of 128000`

going to try to fix this up so that whenever it gets this kind of error it does a /compact as well
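That error-driven fallback could look something like this. This is a hedged sketch, not the actual opencode code: `isContextOverflow`, `chatWithAutoCompact`, and the error-message patterns are hypothetical, and real providers signal context overflow in different ways:

```typescript
// Hypothetical: detect a context-overflow error from the provider.
// Real error shapes vary per provider; matching on the message is brittle.
function isContextOverflow(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return (
    /prompt token count .* exceeds the limit/i.test(msg) ||
    /context_length_exceeded/i.test(msg)
  );
}

// Hypothetical: run the chat call, and on context overflow compact the
// session once, then retry. Any other error is re-thrown unchanged.
async function chatWithAutoCompact(
  send: () => Promise<string>,
  compact: () => Promise<void>,
): Promise<string> {
  try {
    return await send();
  } catch (err) {
    if (!isContextOverflow(err)) throw err;
    await compact(); // summarize the session, then retry once
    return await send();
  }
}
```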

lee101 avatar Jul 30 '25 10:07 lee101

@lee101 tbh I don't know if auto-compacting is a good idea in this case: it will randomly break the agent loop, and that's probably not what the user wants to see in the middle of a tool call.

And if that's the direction opencode wants to take, 80k seems a bit too low anyway: you effectively kill large context sizes (e.g. "how does this thing work?" prompts to 1M-context Gemini).

BOTKooper avatar Jul 30 '25 10:07 BOTKooper

yea, it really needs to be adaptable per model: if there are only 10k tokens left for a given model, then compact.

I'm trying another strategy that instead catches the error when a given LLM complains and compacts at that stage.

Shameless plug: I'm also an API provider and have an open-source summarization tool (https://text-generator.io/docs, https://github.com/TextGeneratorio/text-generator.io/blob/main/questions/summarization.py) that accepts a max_length, so it gives you a summarization API with some control over where to cut off. It could probably be something you do on-device too, since summarization/compaction can take a long time and a lot of tokens but probably doesn't need to be high quality.
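A cheap on-device fallback along those lines, when a full summarization call is too slow or expensive, is to keep the system prompt and drop the oldest turns until the history fits a budget. This is a generic sketch under my own assumptions (the `Msg` shape and `truncateToBudget` are hypothetical, and it uses a character budget rather than real token counts):

```typescript
interface Msg {
  role: "user" | "assistant" | "system";
  content: string;
}

// Hypothetical fallback compaction: preserve system messages, then keep
// the most recent turns that fit within maxChars, dropping the oldest.
function truncateToBudget(msgs: Msg[], maxChars: number): Msg[] {
  const system = msgs.filter((m) => m.role === "system");
  const rest = msgs.filter((m) => m.role !== "system");

  let used = system.reduce((sum, m) => sum + m.content.length, 0);
  const kept: Msg[] = [];
  // Walk backwards so the newest messages survive.
  for (let i = rest.length - 1; i >= 0; i--) {
    const len = rest[i].content.length;
    if (used + len > maxChars) break;
    kept.unshift(rest[i]);
    used += len;
  }
  return [...system, ...kept];
}
```

It loses information instead of condensing it, but it never blocks on another model call.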

lee101 avatar Jul 30 '25 11:07 lee101

regarding 80k specifically: we have info about which model is being used and its limits (thx models.dev), so maybe smth like this would make more sense

```diff
diff --git a/packages/opencode/src/session/index.ts b/packages/opencode/src/session/index.ts
index 5c0cf83a..33c7f383 100644
--- a/packages/opencode/src/session/index.ts
+++ b/packages/opencode/src/session/index.ts
@@ -46,7 +46,7 @@ export namespace Session {
   const log = Log.create({ service: "session" })
 
   const OUTPUT_TOKEN_MAX = 32_000
-  const AUTO_COMPACT_TOKEN_THRESHOLD = 80_000
+  const AUTO_COMPACT_TOKEN_THRESHOLD_PERCENTAGE = 0.8
 
   function estimateTokensFromMessages(messages: { info: MessageV2.Info; parts: MessageV2.Part[] }[]): number {
     let totalChars = 0
@@ -633,8 +633,11 @@ export namespace Session {
 
     // auto compact if estimated tokens exceed 80k threshold
     const estimatedTokens = estimateTokensFromMessages(msgs)
-    if (estimatedTokens > AUTO_COMPACT_TOKEN_THRESHOLD) {
-      log.info("auto-compact triggered", { estimatedTokens, threshold: AUTO_COMPACT_TOKEN_THRESHOLD })
+    if (estimatedTokens > model.info.limit.context * AUTO_COMPACT_TOKEN_THRESHOLD_PERCENTAGE) {
+      log.info("auto-compact triggered", {
+        estimatedTokens,
+        threshold: model.info.limit.context * AUTO_COMPACT_TOKEN_THRESHOLD_PERCENTAGE,
+      })
       await summarize({
         sessionID: input.sessionID,
         providerID: input.providerID,
```

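The core of the percentage idea isolates to a one-liner: the compaction threshold scales with each model's context limit instead of being fixed at 80k. A minimal sketch (the `compactThreshold` helper is hypothetical; in the patch the limit comes from `model.info.limit.context` via models.dev):

```typescript
// Compact once the estimate exceeds 80% of the model's context window.
const AUTO_COMPACT_TOKEN_THRESHOLD_PERCENTAGE = 0.8;

// Hypothetical helper: contextLimit would come from model metadata.
function compactThreshold(contextLimit: number): number {
  return contextLimit * AUTO_COMPACT_TOKEN_THRESHOLD_PERCENTAGE;
}
```

For a 128k model this compacts around 102k tokens; for a 1M-context model it waits until ~800k, which addresses the "you kill large context" objection above.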
BOTKooper avatar Jul 30 '25 11:07 BOTKooper

Claude Code does auto-compaction. For the compaction prompt, it tells the model to be very specific about what the agent is doing, what the user asked for, and where the agentic flow was in completing that task, and then tells the model to move on. It works fine most of the time. I think implementing it would make sense.

Syazvinski avatar Jul 30 '25 11:07 Syazvinski

ok, made those improvements. I think there might still be some issues with images and such, though.

Most likely I'm going to abandon this once it's working well enough for my use case, sorry, so someone would have to pick it up from here and fix it up :)

lee101 avatar Jul 30 '25 22:07 lee101

why has this not been merged?

ansh avatar Sep 09 '25 17:09 ansh