web-llm
Question: Persisted chat
Is restoring context / previous messages / responses supported? If so, how would one go about this? I noticed the worker API has a ChatModule.prefill method, but I'm unsure what prefill means in this context or whether it's even related. If it is, I'm also unsure where to find the prompt formats that the different models expect as input.
Honestly, any insight or resources would be amazing. This lib is awesome; I was able to piece together a working chat in a matter of hours, and if a way to add a concept of memory already exists, that would be even better.
It's not supported. I'm trying to build it now. It's not too hard, but you really have to mess with the llama pipeline. You'll pay a cost in extra input tokenization at load, versus just appending tokens as you go, but the benefit is an "openai-like" pipeline that people already know how to work with.
Got it to work and put up a patch. Not that hard, but I had to access the private pipeline object to stuff things into the conversation.
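For anyone landing here: the "conversation" being stuffed is just the usual OpenAI-style message array, which is also why replaying it at load costs a round of prefill tokenization. A minimal sketch of the shape (the type name here is illustrative, not something exported by web-llm):

```typescript
// OpenAI-style chat history: an ordered array of role/content pairs.
// Restoring a session means replaying this whole array once at load time,
// which is where the extra input tokenization cost mentioned above comes from.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

const restoredHistory: ChatMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is web-llm?" },
  { role: "assistant", content: "An in-browser LLM inference library." },
];
```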
This is now supported through the full OpenAI-compatible API.
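A minimal sketch of restoring a persisted chat through that API, assuming a recent @mlc-ai/web-llm with CreateMLCEngine and the chat.completions endpoint (the model ID below is illustrative; pick any ID from the prebuilt model list):

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function restoreChat() {
  // Model ID is an example; substitute one from web-llm's prebuilt list.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

  // Send the persisted history plus the new user turn in a single request;
  // the engine prefills the earlier messages before generating the reply.
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is web-llm?" },
      { role: "assistant", content: "An in-browser LLM inference library." },
      { role: "user", content: "Does it support persisted chats?" },
    ],
  });

  console.log(reply.choices[0].message.content);
}
```

Since the full history travels with each request, persisting a chat reduces to serializing the messages array (e.g. to localStorage) and passing it back on the next session.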