big-AGI
[Roadmap] Context Token Control
Why
API usage is billed per token and is stateless: every call resends the conversation as input context. Currently there is only a control for output size, so the full conversation history is sent as input on every request. On a 32k-token model, costs grow quickly when you only need the model to remember the recent exchange, such as when using it for code development.
Concise description
When making API calls to Mistral, Google, or OpenAI, I can control the output size but not the input. Checking my token usage as the conversation grows, the cost per message increases because the input grows toward the model's maximum context size (e.g. 28K input tokens for a 4K output).
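To make the cost growth concrete, here is a minimal sketch of the per-message arithmetic. The prices are placeholder assumptions, not any provider's actual rates; only the 28K-input / 4K-output split comes from the example above.

```typescript
// Illustrative arithmetic only: the prices below are assumed placeholders,
// not Mistral's, Google's, or OpenAI's real rates.
const PRICE_PER_1K_INPUT = 0.01;  // $ per 1K input tokens (assumption)
const PRICE_PER_1K_OUTPUT = 0.03; // $ per 1K output tokens (assumption)

function messageCost(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1000) * PRICE_PER_1K_INPUT
       + (outputTokens / 1000) * PRICE_PER_1K_OUTPUT;
}

// The 28K-input / 4K-output case from the description:
console.log(messageCost(28_000, 4_000)); // 0.28 + 0.12 = $0.40 per message
```

Even at these placeholder rates the input side costs more than twice the output side, which is why capping the context that gets sent has a bigger effect on the bill than capping the response.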
Requirements
Add an input-token slider next to the output-token slider that determines how much conversation context is sent with each API call. It can default to the maximum for new users, while more advanced users can lower it to reduce their API bills. A sketch of the trimming logic such a slider could drive follows below.
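As a rough illustration of what the slider could control, the following TypeScript sketch trims the conversation to an input-token budget, dropping the oldest messages first. The `ChatMessage` shape, the `trimToInputBudget` name, and the `countTokens` heuristic are all assumptions for illustration, not big-AGI's actual code.

```typescript
// Hypothetical sketch: trim chat history to an input-token budget.
// The message shape mirrors the common chat-completion format but is
// an assumption here, not big-AGI's internal type.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Placeholder tokenizer: a real implementation would use the model's
// own tokenizer; ~4 characters per token is only a rough heuristic.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function trimToInputBudget(messages: ChatMessage[], inputTokenBudget: number): ChatMessage[] {
  // Always keep the system prompt (if present) so instructions survive trimming.
  const system = messages.find((m) => m.role === 'system');
  let budget = inputTokenBudget - (system ? countTokens(system.content) : 0);

  // Walk backwards from the newest message, keeping as much recent
  // context as fits; older messages are dropped first.
  const kept: ChatMessage[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.role === 'system') continue;
    const cost = countTokens(msg.content);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(msg);
  }
  return system ? [system, ...kept] : kept;
}
```

A real implementation would also need to reserve room for the requested output tokens within the model's total context window, so the slider's effective maximum would be the model limit minus the output-token setting.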