Max context length is broken
I directly used AutoTokenizer and GPT2Tokenizer from transformers, which is also what KAI uses (based on the source code), with from_pretrained(), which KAI also uses, and pointed it at the model I loaded.
I used the API to create a /generate request. I took the given example text there and just copied it many times to get enough context length, so there is no escape character or anything else in it that could cause problems.
I added the line "max_context_length": 1500 (a different number than the default, to be sure KAI actually uses this parameter).
The result is always the same: no matter whether I use AutoTokenizer, GPT2Tokenizer, or any other tokenizer, I get 1419 tokens instead of 1500 in this example.
So it is clear that KAI doesn't cut at 1500 tokens where it should. I don't see where the error in my test script could be; roughly, it looks like the sketch below.
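A minimal sketch of that test, assuming a local KoboldAI instance on http://localhost:5000 and its /api/v1/generate endpoint; the model path, example text, and repeat count are placeholders:

```python
import requests
from transformers import AutoTokenizer

API_URL = "http://localhost:5000/api/v1/generate"  # assumed local KAI instance
MODEL_PATH = "path/to/loaded/model"                # same model KAI has loaded

# Tokenize with the same tokenizer KAI uses for the loaded model.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Repeat the example text until the prompt is clearly longer than 1500 tokens.
example = "This is the example text from the API docs. "  # placeholder
prompt = example * 200
print("prompt tokens:", len(tokenizer.encode(prompt)))  # should exceed 1500

payload = {
    "prompt": prompt,
    "max_context_length": 1500,  # the limit under test
    "max_length": 80,            # amount to generate (KAI's default)
}
response = requests.post(API_URL, json=payload)
print(response.json())
```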
This is not a bug; we subtract the amount to generate so that it has space to generate tokens while preserving the stuff at the top, like your memory / WI, etc.
Will that always be 81 tokens? Or which setting does this reserved space come from?
The Amount to Generate setting, which defaults to 80.
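For illustration, a minimal sketch of that budgeting, not KAI's actual trimming code (which also preserves memory / world info at the top of the context):

```python
def trim_prompt(tokens: list[int], max_context_length: int, max_length: int) -> list[int]:
    """Keep only the newest tokens that fit in the prompt budget."""
    # The amount to generate is reserved out of the total context,
    # e.g. 1500 - 80 = 1420 tokens left for the prompt.
    budget = max_context_length - max_length
    return tokens[-budget:]
```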