Max context length is broken
I directly used AutoTokenizer and GPT2Tokenizer from transformers, which is also what KAI uses (based on the source code), with from_pretrained(), which KAI also uses, and pointed it at the model I loaded.
I used the API to create a /generate request. I took the given example text there and just copied it many times to get enough context length, so there is no escape character or anything else in it that could cause problems.
I added the line "max_context_length": 1500 (a different number than the default, to be sure KAI actually uses this parameter).
The result is always the same: no matter whether I use AutoTokenizer, GPT2Tokenizer, or any other tokenizer, I get 1419 tokens instead of 1500 in this example.
So it is clear that KAI doesn't cut at 1500 tokens where it should. I don't see where the error in my test script could be; roughly, it looks like the sketch below.
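A minimal sketch of that test, assuming a local KoboldAI instance on http://localhost:5000 and its /api/v1/generate endpoint; the model path, example text, and repeat count are placeholders:

```python
import requests
from transformers import AutoTokenizer

API_URL = "http://localhost:5000/api/v1/generate"  # assumed local KAI instance
MODEL_PATH = "path/to/loaded/model"                # same model KAI has loaded

# Tokenize with the same tokenizer KAI uses for the loaded model.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Repeat the example text until the prompt is clearly longer than 1500 tokens.
example = "This is the example text from the API docs. "  # placeholder
prompt = example * 200
print("prompt tokens:", len(tokenizer.encode(prompt)))  # should exceed 1500

payload = {
    "prompt": prompt,
    "max_context_length": 1500,  # the limit under test
    "max_length": 80,            # amount to generate (KAI's default)
}
response = requests.post(API_URL, json=payload)
print(response.json())
```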
This is not a bug; we subtract the amount to generate so that it has space to generate tokens while preserving the stuff at the top, like your memory / WI, etc.
Will that always be 81 tokens? Or which setting does this reserved space come from?
The Amount to Generate setting, which defaults to 80.
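For illustration, a minimal sketch of that budgeting, not KAI's actual trimming code (which also preserves memory / world info at the top of the context):

```python
def trim_prompt(tokens: list[int], max_context_length: int, max_length: int) -> list[int]:
    """Keep only the newest tokens that fit in the prompt budget."""
    # The amount to generate is reserved out of the total context,
    # e.g. 1500 - 80 = 1420 tokens left for the prompt.
    budget = max_context_length - max_length
    return tokens[-budget:]
```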