serge
New Model: Vicuna
Not the self-hosted task manager (which you should also check out, from the same developer). This is a language model!
Trained by fine-tuning on ShareGPT data on top of the Alpaca approach. ShareGPT has since taken down a lot of that data, so this is kind of special. Can't put the toothpaste back in the tube, right?
https://vicuna.lmsys.org/
Some info about it. https://youtu.be/4VByC2NpV30
Vicuna-13B would be amazing. The model reportedly performs similarly to ChatGPT, much like GPT4All does. It would be nice to test that one too.
Someone posted this ~8GB 4-bit quantized model of vicuna-13b on twitter.
I haven't played with any other models locally, but this one seems to end every response with a new "### HUMAN" prompt (similar to, but not exactly, what you asked) and a new "### RESPONSE". Maybe this is a result of its fine-tuning. Not sure if the other models do this, but it seems to work all right on an AMD 5700G with 32 GB of RAM. Also, Markdown-formatted output doesn't always look great in serge, especially code blocks.
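One way to hide those runaway "### HUMAN"/"### RESPONSE" turns is to truncate the completion at the first such marker before displaying it. This is a minimal post-processing sketch, not serge's actual code; the function name and marker list are my own assumptions:

```python
# Hypothetical post-processing: cut a raw completion at the first
# follow-on role marker so the model's invented next turn is dropped.
STOP_MARKERS = ["### Human:", "### HUMAN", "### RESPONSE", "### Assistant:"]

def truncate_at_stop(text: str) -> str:
    cut = len(text)
    for marker in STOP_MARKERS:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)  # keep only text before the earliest marker
    return text[:cut].rstrip()
```

For example, `truncate_at_stop("The answer is 4.\n### Human: what is 5+5?")` returns just `"The answer is 4."`.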
Just put the model in the weights folder and that's all that was needed. I can't vouch for it but it seems to work okay... the answers were coherent. Didn't play with it too much.
Saw this one too which might be the same file.
https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g
I think this would indeed be a good model to add. However, it would be better if we either: A. have a box that lets us input a direct download link for models, or B. make the list of models a plain array so anybody can submit a pull request to easily add more models.
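Option B could be as simple as a list of dicts that the downloader iterates over. This is only a sketch of the idea, not serge's real registry; the variable names and URLs below are placeholders:

```python
# Hypothetical model registry: the downloader reads this list, so adding
# a model becomes a one-line pull request. URLs here are illustrative only.
AVAILABLE_MODELS = [
    {"name": "gpt4all", "url": "https://example.com/gpt4all-lora-quantized.bin"},
    {"name": "vicuna-13b-q4", "url": "https://example.com/ggml-vicuna-13b-4bit.bin"},
]

def model_url(name: str):
    """Return the download URL for a registered model, or None if unknown."""
    for model in AVAILABLE_MODELS:
        if model["name"] == name:
            return model["url"]
    return None
```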
Maybe I got the wrong version, but it's a .safetensors file and Serge doesn't seem to see it at all.
I have done it, using Vicuna 1.0 and serge together. The weights are the existing weights from Hugging Face. The most annoying part is changing the prompt template:

```python
if chat.questions != None:
    for question in chat.questions:
        if question.error != None:
            # skip errored out prompts
            continue
        prompt += "### Human:\n" + question.question + "\n"
        prompt += "### Assistant:\n" + question.answer + "\n"
```
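To see what that loop produces, here is a self-contained rendering with stand-in objects instead of serge's real chat models (the `SimpleNamespace` fixtures are mine, not the project's):

```python
# Minimal demonstration of the Vicuna-style prompt built by the loop above,
# using stand-in objects in place of serge's chat/question models.
from types import SimpleNamespace

chat = SimpleNamespace(questions=[
    SimpleNamespace(error=None, question="What is 2+2?", answer="4"),
])

prompt = ""
if chat.questions is not None:
    for question in chat.questions:
        if question.error is not None:
            continue  # skip errored out prompts
        prompt += "### Human:\n" + question.question + "\n"
        prompt += "### Assistant:\n" + question.answer + "\n"

# prompt is now "### Human:\nWhat is 2+2?\n### Assistant:\n4\n"
```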
I was testing Vicuna (now legacy) as well and noticed the ### HUMAN response. Maybe a stupid question, but in which file do I need to change the configuration above?
Also, is it possible to load a multi-bin model (such as Vicuna 1.1) in Serge?
Update: solved it via this Vicuna 1.1 model. Still would like to know, though :)
I added support for Open Assistant & Vicuna models in the serge downloader. Feel free to go to this issue: https://github.com/nsarrazin/serge/issues/217 if you're missing some models you'd like to see.