llama
Making it continue for more tokens?
Bit of a dumb question probably, but what is the best way to make it continue for, say, another 256 tokens?
Say your prompt is 30 tokens and your output is 100 tokens. Do you just feed the combined 130 tokens back in as the new prompt, and then repeat?
I know if you tried to write a book with this it wouldn't do very well because it would forget what it wrote at the start of the book.
(One way around that, which works with ChatGPT, is to ask it to "summarise the previous text" and prepend that summary to the prompt when continuing the novel, so it keeps a summary of the story in context but may forget specific details.)
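The feed-back loop described above can be sketched like this. Note that `generate` here is a stand-in for a real model call (llama.cpp, transformers, etc.); it just appends dummy token ids so the loop structure is runnable:

```python
def generate(prompt_tokens, max_new_tokens):
    # Stand-in for a real model call (e.g. llama.cpp or transformers).
    # A real backend would return the prompt plus sampled continuation tokens.
    return prompt_tokens + list(range(len(prompt_tokens),
                                      len(prompt_tokens) + max_new_tokens))

def continue_generation(prompt_tokens, chunk=256, rounds=1):
    tokens = list(prompt_tokens)
    for _ in range(rounds):
        # Feed the entire sequence so far back in as the new prompt,
        # and ask for another `chunk` of tokens.
        tokens = generate(tokens, max_new_tokens=chunk)
    return tokens

# 30-token prompt, 100 new tokens -> 130 tokens total after one round
result = continue_generation(list(range(30)), chunk=100, rounds=1)
```

So yes: each round the old prompt plus the old output becomes the new prompt, until you hit the context limit.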
I'll try max_length=1500,
Try https://github.com/oobabooga/text-generation-webui, which has such a "continue" feature built in. But yes, beyond the 2048-token context limit you need to find workarounds.