
Enhancement: Dynamic Context Token Size for OpenRouter LLM

Open dannykorpan opened this issue 1 year ago • 2 comments

What is your question?

Hi,

I'm testing the codellama/codellama-70b-instruct LLM from OpenRouter. It's limited to a context length of 2,048 tokens, but it's supposedly capable of unlimited context length.

My questions are:

  1. How do I set a longer context length, e.g. 128k?
  2. How do I disable the middle-out transform?

More Details

Link to LLM on OpenRouter: https://openrouter.ai/models/codellama/codellama-70b-instruct

Thank you for your help in advance, Danny

What is the main subject of your question?

No response

Screenshots

(screenshot)

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

dannykorpan avatar Feb 01 '24 12:02 dannykorpan

You can't, but OpenRouter does return context token size when fetching models. I've been thinking on how to use this and will have an update on this soon
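As a sketch of what "fetching models" returns: OpenRouter's model listing (`GET https://openrouter.ai/api/v1/models`) advertises a per-model context size. The exact payload shape and field name (`context_length`) below are assumptions based on this thread; check the live API before relying on them.

```python
# Sketch: read per-model context sizes from an OpenRouter /models payload.
# The response shape is an assumption; verify field names against the docs.

def context_sizes(models_response: dict) -> dict:
    """Map model id -> advertised context window from a /models payload."""
    return {
        m["id"]: m.get("context_length")
        for m in models_response.get("data", [])
    }

# Example payload shaped like the thread describes: the listing includes
# the model's context token size alongside its id.
sample = {
    "data": [
        {"id": "codellama/codellama-70b-instruct", "context_length": 2048},
    ]
}

print(context_sizes(sample))
```

A client could use this mapping to cap its own prompt size instead of letting the provider silently compress the prompt.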

danny-avila avatar Feb 01 '24 15:02 danny-avila


So I couldn't find a way to "disable" middle-out in their docs, but if you get "infinite" context, that implies middle-out is activated, so my assumption is that "disabling" it means abiding by that model's 2,048-token context.

My upcoming update will use their advertised context sizes, which may not be satisfactory for this particular model since it only has 2,048 context. I think this is fine, though, since they are discarding your tokens anyway (it just won't appear that way).

danny-avila avatar Feb 01 '24 20:02 danny-avila

Hi, I've found this in the documentation. Do you disable it by changing `transforms: ["middle-out"]` to `transforms: []`?

> middle-out: compress prompts and message chains to the context size. This helps users extend conversations, in part because [LLMs pay significantly less attention](https://twitter.com/xanderatallah/status/1678511019834896386) to the middle of sequences anyway. Works by compressing or removing messages in the middle of the prompt.
>
> Note: [All OpenRouter models](https://openrouter.ai/models) default to using middle-out, unless you exclude this transform by e.g. setting transforms: [] in the request body.

https://openrouter.ai/docs#errors
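Following the quoted docs, disabling middle-out would just mean sending an empty `transforms` array in the request body. A minimal sketch of such a payload (the model name comes from this thread; the surrounding OpenAI-style fields are assumptions based on OpenRouter's chat-completions schema):

```python
import json

# Sketch of a chat-completions request body with middle-out disabled,
# per the quoted docs ("setting transforms: [] in the request body").
payload = {
    "model": "codellama/codellama-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "transforms": [],  # empty list = exclude the default middle-out transform
}

print(json.dumps(payload))
```

With `transforms: []`, a prompt exceeding the model's context would presumably fail with a context-length error instead of being compressed.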

dannykorpan avatar Feb 02 '24 08:02 dannykorpan

LibreChat is already doing what OpenRouter would do if you exceed the context, though.

> it's capable for unlimited context length.

This is not really true. They just discard your tokens with the "middle-out" strategy if you go over. Disabling it would run you into a context length error. LibreChat discards tokens that would exceed the context, starting from the tail end of the conversation.
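The discarding described here can be sketched roughly as follows. This is not LibreChat's actual code; `count_tokens` is a crude stand-in (a real client would use the model's tokenizer), and the policy shown — keep the newest messages, drop whole messages from the oldest end once the budget is exceeded — is one plausible reading of "starting from the tail end of the conversation":

```python
# Rough sketch (not LibreChat's implementation) of discarding whole
# messages that would exceed a model's context window, oldest first.

def count_tokens(text: str) -> int:
    # Crude approximation for illustration; real code uses a tokenizer.
    return len(text.split())

def fit_to_context(messages: list[str], max_tokens: int) -> list[str]:
    kept: list[str] = []
    total = 0
    # Walk from newest to oldest, keeping messages while they still fit.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    kept.reverse()  # restore chronological order
    return kept

history = ["one two three", "four five", "six seven eight nine"]
print(fit_to_context(history, 6))  # oldest message dropped
```

Either way the oldest tokens are lost; the difference is only whether the client (LibreChat) or the provider (OpenRouter's middle-out) does the trimming.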

danny-avila avatar Feb 02 '24 08:02 danny-avila