serge New Model: Vicuna

Not the self-hosted task manager (though you should also check out that developer). This is a language model!

It was fine-tuned on ShareGPT conversation data, building on Alpaca. ShareGPT has since taken down a lot of that data, so this is kind of special. Can't put the toothpaste back in the tube, right?

https://vicuna.lmsys.org/

Some info about it. https://youtu.be/4VByC2NpV30

Apr 04 '23 02:04 bradgillap

Vicuna-13B would be amazing. The model performs similarly to ChatGPT, just like GPT4All does. It would be nice to test that one too.

Apr 04 '23 05:04 pale2hall

Someone posted this ~8GB 4-bit quantized model of vicuna-13b on twitter.

I haven't played with any other models locally, but this one seems to end every response with a new "### HUMAN" prompt (that is similar but not exactly what you asked) and a new "### RESPONSE". Maybe this is a result of its fine-tuning or something. Not sure if the other models do this, but anyway, it seems to work a'right on an AMD 5700G w/32G RAM. Also, the output formatted in Markdown doesn't always seem to look great in serge, especially with code blocks.
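One way to hide the trailing made-up turn is to cut the generated text at the first "### Human" marker. This is only a minimal sketch, not something Serge is known to do: the function name `truncate_at_stop` and the default marker are assumptions for illustration.

```python
import re


def truncate_at_stop(text, stop_markers=("### Human",)):
    """Cut text at the earliest stop marker (case-insensitive).

    Hypothetical helper: the marker list is an assumption based on the
    "### HUMAN" turns described above.
    """
    cut = len(text)
    for marker in stop_markers:
        match = re.search(re.escape(marker), text, flags=re.IGNORECASE)
        if match:
            cut = min(cut, match.start())
    # Drop the newline(s) left in front of the removed marker.
    return text[:cut].rstrip()


raw = "Paris is the capital of France.\n### HUMAN: What about Spain?\n### RESPONSE: Madrid."
print(truncate_at_stop(raw))
```

The same marker could instead be passed to the inference backend as a stop sequence, if the backend supports one, so that generation halts early instead of being trimmed afterwards.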

Just put the model in the weights folder and that's all that was needed. I can't vouch for it but it seems to work okay... the answers were coherent. Didn't play with it too much.

Apr 04 '23 07:04 fat-tire

Saw this one too which might be the same file.

https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g

Apr 07 '23 02:04 bradgillap

I think this would indeed be a good model to add. However, I think it would be better if we either (A) had a box that lets us enter a direct download link for models, or (B) made the list of models a plain array, so that anybody can submit a pull request to easily add more models.
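Option B could look roughly like the sketch below. Everything in it is hypothetical: the `ModelEntry` fields, the `MODELS` list, the `find_model` helper, and the placeholder example.com URLs are illustrations, not Serge's actual downloader code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelEntry:
    name: str      # label shown in the model picker
    url: str       # direct download link (placeholder values below)
    filename: str  # file name to save under in the weights folder


# Adding a model would be a one-line pull request against this list.
MODELS: List[ModelEntry] = [
    ModelEntry("example-7b", "https://example.com/models/example-7b.bin", "example-7b.bin"),
    ModelEntry("example-13b", "https://example.com/models/example-13b.bin", "example-13b.bin"),
]


def find_model(name: str, models: List[ModelEntry] = MODELS) -> Optional[ModelEntry]:
    """Return the entry with the given name, or None if it is not listed."""
    for entry in models:
        if entry.name == name:
            return entry
    return None
```

Keeping the list as plain data means a contributor only touches one entry and never the download logic.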

Apr 07 '23 18:04 Stetsed

> Just put the model in the weights folder and that's all that was needed. I can't vouch for it but it seems to work okay... the answers were coherent. Didn't play with it too much.

Maybe I got the wrong version, but it's a .safetensors file and Serge doesn't seem to see it at all.

Apr 13 '23 22:04 Reezlaw

I have done it, using Vicuna 1.0 and Serge together. The weights are the existing ones from Hugging Face. The most annoying part is changing the prompt code:

    if chat.questions != None:
        for question in chat.questions:
            if question.error != None:  # skip errored out prompts
                continue
            prompt += "### Human:\n" + question.question + "\n"
            prompt += "### Assistant:\n" + question.answer + "\n"
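For anyone who wants to try the loop above outside of Serge, here is a self-contained sketch. The `Question` and `Chat` classes are hypothetical stand-ins for Serge's own objects, and appending the new question followed by an open "### Assistant:" turn is an assumption about how the prompt is finished.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Question:
    # Stand-in for Serge's question record (hypothetical shape).
    question: str
    answer: str
    error: Optional[str] = None


@dataclass
class Chat:
    # Stand-in for Serge's chat record (hypothetical shape).
    questions: Optional[List[Question]] = field(default_factory=list)


def build_prompt(chat: Chat, new_question: str) -> str:
    """Build a Vicuna 1.0 style prompt from the chat history."""
    prompt = ""
    if chat.questions is not None:
        for question in chat.questions:
            if question.error is not None:
                # skip errored out prompts
                continue
            prompt += "### Human:\n" + question.question + "\n"
            prompt += "### Assistant:\n" + question.answer + "\n"
    # Assumption: the new question ends with an open assistant turn.
    prompt += "### Human:\n" + new_question + "\n"
    prompt += "### Assistant:\n"
    return prompt


chat = Chat([Question("Hi", "Hello!"), Question("bad", "", error="timeout")])
print(build_prompt(chat, "How are you?"))
```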

Apr 14 '23 02:04 BenjiKCF

> I have done it. Using vicuna 1.0 and serge together. The weight is the existing weight from huggingface. The most annoying part is changing
>
>     if chat.questions != None:
>         for question in chat.questions:
>             if question.error != None:  # skip errored out prompts
>                 continue
>             prompt += "### Human:\n" + question.question + "\n"
>             prompt += "### Assistant:\n" + question.answer + "\n"

I was testing Vicuna (Legacy now) as well and noticed the ### HUMAN response. Maybe a stupid question but in what file do I need to change the configuration above?

Also is it possible to load a multi-bin model (such as Vicuna 1.1) in Serge?

Update, solved it via this vicuna 1.1 model, still would like to know though :)

Apr 14 '23 18:04 FumbleNL

I added support for open assistant & vicuna models to the serge downloader. Feel free to go to this issue: https://github.com/nsarrazin/serge/issues/217 if you're missing some models you'd like to see.

Apr 24 '23 23:04 nsarrazin
