Slow output due to regeneration of input prompt
To give an example: if I input a chunk of paragraphs as the prompt with an instruction to summarize the text, the output response appears to regenerate the input prompt from the start rather than continuing straight from it (see the reference video below). This seems like unusual behavior compared to my experience with other language model UIs (e.g. oobabooga).
Can this be improved?
https://user-images.githubusercontent.com/62466671/228490563-b6e92c44-d0a2-4f89-adc5-adb1e060a8c2.mp4
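For what it's worth, a possible explanation: an autoregressive model has to process every prompt token before it can generate anything, and some backends echo the prompt while (or after) evaluating it, so the prompt appearing in the output stream doesn't necessarily mean it is being re-sampled token by token. Below is a minimal sketch of this distinction using Hugging Face transformers and gpt2 (an illustration only, not dalai's actual llama.cpp backend): `generate()` returns the prompt tokens followed by the new tokens, so a UI that decodes the whole sequence prints the prompt back first, while a UI that slices the prompt off shows only the continuation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Summarize the following text: ..."
inputs = tokenizer(prompt, return_tensors="pt")

# generate() returns prompt token ids followed by newly sampled ids
output_ids = model.generate(**inputs, max_new_tokens=50)

# Decoding the full sequence echoes the prompt first,
# which looks like the behavior in the video:
print(tokenizer.decode(output_ids[0]))

# Slicing off the prompt tokens prints only the continuation,
# which is how UIs like oobabooga typically display responses:
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens))
```

If that's what is happening here, the fix would be on the display side (don't echo the prompt), though the time spent evaluating a long prompt before the first new token is unavoidable.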
I also noticed this. Perhaps this is simply how these models work, and ChatGPT only seems faster at this step because of the sheer scale of the servers it runs on?