Allow dynamic removal of history elements to manage token limit
Is your feature request related to a problem? Please describe.
I'm always frustrated when, using this library with gpt-3.5-turbo, I hit the model's maximum token limit. The token cap restricts how much conversation history I can include in a request, which degrades the quality of the generated responses.
Describe the solution you'd like
I'd like to be able to remove selected elements of the history dynamically, so I can control the token count of a request and avoid hitting the maximum limit. A sketch of what this could look like is below.
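As a rough illustration of the kind of API I have in mind (all names here are hypothetical, not part of the library's current interface), a helper could drop the oldest non-system messages until the history fits a token budget, using tiktoken for counting:

```python
# Hypothetical sketch, not the library's actual API. Assumes history is a
# list of {"role": ..., "content": ...} dicts and tiktoken is installed.
import tiktoken

def trim_history(messages, max_tokens=4096, model="gpt-3.5-turbo"):
    """Drop the oldest non-system messages until the history fits the budget."""
    enc = tiktoken.encoding_for_model(model)

    def count(msgs):
        # Rough count of content tokens only; the chat API adds a few
        # tokens of per-message overhead on top of this.
        return sum(len(enc.encode(m["content"])) for m in msgs)

    trimmed = list(messages)
    while count(trimmed) > max_tokens:
        # Preserve the system prompt; remove the oldest user/assistant turn.
        for i, m in enumerate(trimmed):
            if m["role"] != "system":
                del trimmed[i]
                break
        else:
            break  # only system messages left, nothing more to remove
    return trimmed
```

A variant that removes messages by index or by id would cover the "remove certain elements" case directly; the helper above just shows automatic enforcement of a token budget.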
Describe alternatives you've considered
Switching from gpt-3.5-turbo to gpt-4. GPT-4 offers a higher maximum token limit, so the constraint would bite less often. However, GPT-4 is considerably more expensive than gpt-3.5-turbo, which may not be feasible for all users and projects.
Additional context
gpt-3.5-turbo has a maximum context of 4,096 tokens at $0.002 / 1K tokens. gpt-4 offers a larger 8,192-token context, but at roughly $0.03 / 1K tokens for prompts and $0.06 / 1K tokens for completions.
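For concreteness, a quick back-of-the-envelope comparison using the prices above (the token counts are just an example split of a maxed-out gpt-3.5-turbo request):

```python
# Cost comparison based on the per-token prices quoted above.
GPT35_PRICE = 0.002 / 1000           # $ per token, prompt + completion
GPT4_PROMPT_PRICE = 0.03 / 1000      # $ per prompt token
GPT4_COMPLETION_PRICE = 0.06 / 1000  # $ per completion token

prompt_tokens, completion_tokens = 3000, 1000  # example request

gpt35_cost = (prompt_tokens + completion_tokens) * GPT35_PRICE
gpt4_cost = (prompt_tokens * GPT4_PROMPT_PRICE
             + completion_tokens * GPT4_COMPLETION_PRICE)

print(f"gpt-3.5-turbo: ${gpt35_cost:.4f}")  # $0.0080
print(f"gpt-4:         ${gpt4_cost:.4f}")   # $0.1500
```

At these prices the same exchange costs roughly 19x more on gpt-4, which is why trimming history on gpt-3.5-turbo is the more attractive option.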