LibreChat
Enhancement: Dynamic Context Token Size for OpenRouter LLM
What is your question?
Hi,
I'm testing the codellama/codellama-70b-instruct LLM from OpenRouter. It's limited to a context length of 2,048 tokens, but it's capable of unlimited context length.
My questions are:
- How do I set a longer context length, e.g. 128k?
- How do I disable OpenRouter's middle-out transform?
More Details
Link to LLM on OpenRouter: https://openrouter.ai/models/codellama/codellama-70b-instruct
Thank you for your help in advance, Danny
What is the main subject of your question?
No response
Screenshots
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
You can't, but OpenRouter does return the context token size when fetching models. I've been thinking about how to use this and will have an update on it soon.
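For reference, a minimal sketch of reading those per-model context sizes, assuming OpenRouter's public `/api/v1/models` endpoint and its `context_length` field (verify both against their current docs):

```ts
interface OpenRouterModel {
  id: string;
  context_length: number;
}

// Fetch OpenRouter's model list and look up one model's context size.
async function getContextLength(modelId: string): Promise<number | undefined> {
  const res = await fetch('https://openrouter.ai/api/v1/models');
  const { data } = (await res.json()) as { data: OpenRouterModel[] };
  return data.find((m) => m.id === modelId)?.context_length;
}

// e.g. await getContextLength('codellama/codellama-70b-instruct') // -> 2048
```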
So I couldn't find a way to "disable" middle-out in their docs, but if you get "infinite" context, that implies middle-out is activated, so my assumption is that "disabling" it means abiding by the 2,048-token context of that model.
My upcoming update will use the context sizes they report, which may not be satisfactory to you for this particular model since it only has a 2,048-token context. I think this is fine though, since they are discarding your tokens anyway (it just won't appear that way).
Hi, I've found this in the documentation. Can you disable it by changing transforms: ["middle-out"] to transforms: []?
> middle-out: compress prompts and message chains to the context size. This helps users extend conversations in part because [LLMs pay significantly less attention](https://twitter.com/xanderatallah/status/1678511019834896386) to the middle of sequences anyway. Works by compressing or removing messages in the middle of the prompt.
>
> Note: [All OpenRouter models](https://openrouter.ai/models) default to using middle-out, unless you exclude this transform by e.g. setting transforms: [] in the request body.
https://openrouter.ai/docs#errors
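If that reading of the docs is right, the transform could be excluded per request. A minimal sketch, assuming OpenRouter's OpenAI-compatible `/api/v1/chat/completions` endpoint and the `transforms` request field from the docs quoted above:

```ts
async function requestWithoutMiddleOut(apiKey: string) {
  const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'codellama/codellama-70b-instruct',
      // Per the quoted docs, an empty array should exclude middle-out;
      // exceeding the 2,048-token context would then error rather than
      // being silently compressed.
      transforms: [],
      messages: [{ role: 'user', content: 'Write a quicksort in C.' }],
    }),
  });
  return res.json();
}
```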
LibreChat is already doing what OpenRouter would do if you exceed the context though.
> it's capable of unlimited context length.
This is not really true. They just discard your tokens with the "middle-out" strategy if you go over; disabling it would run you into a context length error. LibreChat discards tokens that would exceed the context, starting from the tail end of the conversation.
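To illustrate that trimming behavior, here's a rough sketch (not LibreChat's actual code), keeping the newest messages that fit and dropping older ones first; `countTokens` is a crude stand-in for a real tokenizer:

```ts
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Crude token estimate (~4 characters per token); a real tokenizer differs.
const countTokens = (text: string): number => Math.ceil(text.length / 4);

// Keep as many of the most recent messages as fit within the context limit,
// discarding the rest of the conversation.
function fitToContext(messages: Message[], contextLimit: number): Message[] {
  const kept: Message[] = [];
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i].content);
    if (total + cost > contextLimit) break;
    kept.unshift(messages[i]);
    total += cost;
  }
  return kept;
}
```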