Add Prompt Optimizer Feature to Reduce Token Usage
This request is based on a conversation I had in Discord, where two users were hoping for some kind of prompt optimizer similar to Cursor's.
Problem
Users with detailed, long prompts (especially when defining app requirements) can quickly exhaust their token budget, particularly when using token-intensive models like Claude Sonnet 4. While Goose has an auto-compact feature that kicks in at 80% of context-window usage, optimizing prompts before they're sent would help reduce token consumption from the start.
Proposed Solution
Add a built-in prompt optimizer feature to Goose that can rephrase long requirements and descriptions into more concise versions while preserving the essential information and intent. This would be similar to features found in tools like Trae/Cursor.
Use Case
When a user provides a lengthy initial prompt with detailed requirements, the optimizer could (see the sketch after this list):
- Condense verbose descriptions while maintaining clarity
- Remove redundant information
- Restructure for token efficiency
- Preserve all critical technical details and requirements
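These four requirements map fairly directly onto the instruction the optimizer model itself would be given. A minimal sketch of that meta-prompt as a Python constant (the wording is illustrative, not taken from any existing implementation):

```python
# Hypothetical system prompt for the optimizer model; the wording is
# illustrative only and mirrors the four requirements listed above.
OPTIMIZER_SYSTEM_PROMPT = """\
You rewrite user prompts to use fewer tokens.
- Condense verbose descriptions while keeping them unambiguous.
- Remove information that is repeated or already implied.
- Restructure prose into terse lists where that saves tokens.
- Never drop technical details, requirements, names, paths, or constraints.
Return only the rewritten prompt.
"""
```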
Benefits
- Reduce token consumption, especially with expensive models
- Extend conversation length before hitting context limits
- Complement the existing auto-compact feature at 80% usage
- Help users get more value from their token budget
Additional Context
- Current workaround: use a lead/worker multi-model setup to delegate work to cheaper models
- Related docs:
  - https://block.github.io/goose/blog/2025/08/18/understanding-context-windows
  - https://block.github.io/goose/docs/tutorials/lead-worker/
  - https://block.github.io/goose/docs/guides/multi-model/autopilot
Potential Implementation Ideas
- Optional pre-processing step for user prompts (sketched below)
- Integration with Claude or other LLMs for prompt refinement
- User toggle to enable/disable optimization
- Preview optimized prompt before sending
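To make these ideas concrete, here is a minimal sketch of the pre-processing step in Python (Goose itself is Rust; this is an illustration, not a proposed patch). It assumes an OpenAI-compatible endpoint reached through the `openai` package, folds in the user toggle and the preview step, and uses a compression instruction like the meta-prompt sketched earlier. Function and parameter names are made up for the example.

```python
"""Sketch of the optional pre-processing step, user toggle, and preview
ideas above. Assumptions (none of this is existing Goose code): the
optimizer is one extra LLM call made before the real request, and it can
target any OpenAI-compatible endpoint via the `openai` Python package."""
from openai import OpenAI

OPTIMIZER_INSTRUCTION = (
    "Rewrite the following prompt to use fewer tokens. Keep every "
    "technical detail, requirement, and constraint. Return only the "
    "rewritten prompt."
)

def optimize_prompt(prompt: str, *, base_url: str, api_key: str,
                    model: str, enabled: bool = True) -> str:
    """Return a condensed version of `prompt`, or `prompt` unchanged
    when the (proposed) user toggle is off."""
    if not enabled:
        return prompt
    client = OpenAI(base_url=base_url, api_key=api_key)
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": OPTIMIZER_INSTRUCTION},
            {"role": "user", "content": prompt},
        ],
    )
    optimized = resp.choices[0].message.content or prompt
    # "Preview optimized prompt before sending": show both versions and
    # let the user decide which one actually goes to the main model.
    print("--- original ---\n" + prompt)
    print("--- optimized ---\n" + optimized)
    return optimized if input("Use optimized? [y/N] ").lower() == "y" else prompt
```

Note that the optimization call itself costs tokens, which is why routing it to a cheaper model (see the discussion below) matters.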
So, FWIW, auto-compact kicks in at 80%, but you can manually compact at any time you want.
I am not convinced we would want this as a separate feature, though. As @Abhijay007 mentioned, the prompt refinement would still be done using an LLM, so you'd still pay for those tokens.
A workaround is just to have a conversation with Goose about the thing you want to build and have it output an .md with that as a PRD, chatting with Goose until you are happy. Then start a second conversation and use that PRD as the prompt. Is that what we want, but then more elegant? If so, give that workflow a shot and let us know how we can do better than that. To me, doing that, then hitting summarize, then saying "now let's do this for real" would probably work well already.
kk, I'll copy this to the folks in Discord so they can chime in. I shared a bunch of ways to manage context in a "smart" way, including blog posts I've written.
I would like to pitch a prompt enhancer similar to what Roo Code uses; here is a video that summarizes it: https://www.youtube.com/shorts/z0wEJngEe2Y. You will need to actually use it to get to know its advantages. We could give users the ability to use a different, cheaper/weaker (free or local) LLM for the summarization task. It would help in reducing the cost and in producing more LLM-friendly prompts.
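For example, pointing such an optimizer at a local Ollama server would look something like this, reusing the `optimize_prompt` sketch above (`llama3.2` and `long_requirements_text` are placeholders, not real defaults):

```python
# Ollama exposes an OpenAI-compatible API on localhost:11434 by default.
# It ignores the API key, but the client library requires a non-empty one.
condensed = optimize_prompt(
    long_requirements_text,           # placeholder: the user's long prompt
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="llama3.2",                 # placeholder: any locally pulled model
)
```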
I think this would be a great tool to have within Goose, both for prompts you want to use in your own code and for optimising a recipe that runs often (i.e., a button within the recipe UI to optimise it).
A similar one exists here that I have used a few times: https://platform.openai.com/chat/edit?optimize=true