
FastChat-T5 4K context

Open tutankhamen-1 opened this issue 2 years ago • 5 comments

lmsys.org states that FastChat-T5 supports a context size of 4K. How do I get it to work? I get an error as soon as I go above 2K.

tutankhamen-1 avatar Jun 15 '23 21:06 tutankhamen-1

It can encode 2K tokens and output 2K tokens, for a total of 4K tokens. But it cannot take in 4K tokens at once, @tutankhamen-1. In contrast, Llama-like models share a single 2K budget between input and output.
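The difference between the two budget models described above can be sketched in a few lines. This is an illustrative sketch, not FastChat's actual validation code; the function names and constants are assumptions for the example.

```python
# Hypothetical sketch of the two context-budget models described above.
# These names are illustrative, not FastChat's real API.

ENCODER_LIMIT = 2048   # FastChat-T5: tokens the encoder can take in
DECODER_LIMIT = 2048   # FastChat-T5: tokens the decoder can produce
SHARED_LIMIT = 2048    # Llama-style: one budget for prompt + completion

def fits_encoder_decoder(prompt_tokens: int, completion_tokens: int) -> bool:
    """T5-style: the prompt and the completion each get their own 2K budget."""
    return prompt_tokens <= ENCODER_LIMIT and completion_tokens <= DECODER_LIMIT

def fits_decoder_only(prompt_tokens: int, completion_tokens: int) -> bool:
    """Llama-style: the prompt and the completion share one 2K budget."""
    return prompt_tokens + completion_tokens <= SHARED_LIMIT

# A 1790-token prompt with a 512-token completion fits the T5 split
# budget but exceeds the shared Llama-style budget (2302 > 2048).
print(fits_encoder_decoder(1790, 512))  # True
print(fits_decoder_only(1790, 512))     # False
```

So "4K context" here means 2K in plus 2K out, not a single 4K window.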

DachengLi1 avatar Jun 15 '23 23:06 DachengLi1

> It can encode 2K tokens and output 2K tokens, for a total of 4K tokens. But it cannot take in 4K tokens at once, @tutankhamen-1. In contrast, Llama-like models share a single 2K budget between input and output.

That’s great, but the 2K total limit seems to be hardcoded in many places and I can’t get it to work. I’m trying to use it through the API.

tutankhamen-1 avatar Jun 16 '23 06:06 tutankhamen-1

The current behavior should be correct. It can only encode 2K tokens, which is the hardcoded limit you are seeing, but it can output another 2K tokens on top of that. If you use Llama (Vicuna), it can encode 2K tokens in total, so if you give it a 2K-token prompt, it cannot output anything.

DachengLi1 avatar Jun 16 '23 15:06 DachengLi1

This is the error message I get:

This model's maximum context length is 2048 tokens. However, you requested 2302 tokens (1790 in the messages, 512 in the completion). Please reduce the length of the messages or completion.

Model: fastchat-t5-3b-v1.0

tutankhamen-1 avatar Jun 16 '23 15:06 tutankhamen-1

@tutankhamen-1 Thanks for letting us know! We will fix it. @merrymercy Let's change the error message for T5?

DachengLi1 avatar Jun 16 '23 15:06 DachengLi1

Isn't this limit somewhat arbitrary for T5, given its attention mechanism? My understanding is that memory grows quadratically with context length, but as long as you have the RAM to support it, a longer context isn't limited by the model itself.
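The quadratic-memory point can be made concrete with a back-of-envelope estimate of the attention score matrix. The head count and dtype below are illustrative assumptions, not measured values for FastChat-T5:

```python
# Back-of-envelope sketch of why attention memory grows quadratically
# with context length. Head count and bytes-per-value are illustrative
# assumptions, not FastChat-T5's actual configuration.

def attention_matrix_bytes(seq_len: int, num_heads: int = 32,
                           bytes_per_value: int = 2) -> int:
    """Size of one layer's attention score matrix: heads x n x n values."""
    return num_heads * seq_len * seq_len * bytes_per_value

for n in (2048, 4096, 8192):
    mib = attention_matrix_bytes(n) / 2**20
    print(f"{n:5d} tokens -> {mib:6.0f} MiB per layer")
```

Doubling the context length quadruples this matrix, so the 2K limit is a capacity/quality choice rather than a hard architectural wall, though models also tend to degrade beyond the lengths they were trained on.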

Taytay avatar Jun 30 '23 18:06 Taytay

@tutankhamen-1 Could you help us fix the bug and contribute a pull request?

merrymercy avatar Jul 05 '23 09:07 merrymercy