web-llm
Question: Persisted chat
Is restoring context / previous messages / responses supported? If so, how would one go about this? I noticed the worker API has a ChatModule.prefill method, but I'm unsure what prefill means in this context or whether it's even related. If it is, I'm also unsure where to find the prompt formats that the different models expect as input.
Honestly, any insight or resources would be amazing. This lib is awesome; I was able to piece together a working chat in a matter of hours, and if a way to add a concept of memory already exists, that would be even better.
It's not supported. I'm trying to build it now. It's not too hard, but you really have to mess with the llama pipeline. You'll pay a cost in extra input tokenization at load, versus just appending tokens as you go, but the benefit is an "openai-like" pipeline that people already know how to work with.
Got it to work and put up a patch. Not that hard, but I had to access the private pipeline object to stuff things into the conversation.
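For anyone landing here: the "conversation" being stuffed is just the usual OpenAI-style message array, which is also why replaying it at load costs a round of prefill tokenization. A minimal sketch of the shape (the type name here is illustrative, not something exported by web-llm):

```typescript
// OpenAI-style chat history: an ordered array of role/content pairs.
// Restoring a session means replaying this whole array once at load time,
// which is where the extra input tokenization cost mentioned above comes from.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

const restoredHistory: ChatMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is web-llm?" },
  { role: "assistant", content: "An in-browser LLM inference library." },
];
```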
This is now supported through the full OpenAI-compatible API.
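A minimal sketch of restoring a persisted chat through that API, assuming a recent @mlc-ai/web-llm with CreateMLCEngine and the chat.completions endpoint (the model ID below is illustrative; pick any ID from the prebuilt model list):

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function restoreChat() {
  // Model ID is an example; substitute one from web-llm's prebuilt list.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

  // Send the persisted history plus the new user turn in a single request;
  // the engine prefills the earlier messages before generating the reply.
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is web-llm?" },
      { role: "assistant", content: "An in-browser LLM inference library." },
      { role: "user", content: "Does it support persisted chats?" },
    ],
  });

  console.log(reply.choices[0].message.content);
}
```

Since the full history travels with each request, persisting a chat reduces to serializing the messages array (e.g. to localStorage) and passing it back on the next session.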