
New Model: Vicuna

Open bradgillap opened this issue 1 year ago • 7 comments

Not the self-hosted task manager (whose developer you should also check out). This is a language model!

Trained on ShareGPT data on top of Alpaca. ShareGPT has since taken down a lot of that data, so this is kind of special. Can't put the toothpaste back in the tube, right?

https://vicuna.lmsys.org/

Some info about it. https://youtu.be/4VByC2NpV30

bradgillap Apr 04 '23 02:04

Vicuna-13B would be amazing. The model performs similarly to ChatGPT, just like GPT4ALL does. It would be nice to test that one too.

pale2hall Apr 04 '23 05:04

Someone posted this ~8GB 4-bit quantized model of vicuna-13b on twitter.

I haven't played with any other models locally, but this one seems to end every response with a new "### HUMAN" prompt (similar to, but not exactly, what you asked) and a new "### RESPONSE". Maybe this is a result of its fine-tuning or something. Not sure if the other models do this, but anyway, it seems to work a'right on an AMD 5700G w/32G RAM. Also, output formatted in Markdown doesn't always look great in serge, especially code blocks.

Just put the model in the weights folder and that's all that was needed. I can't vouch for it but it seems to work okay... the answers were coherent. Didn't play with it too much.
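One possible workaround for the extra "### HUMAN" / "### RESPONSE" turns described above is to cut the generated text at the first turn marker. This is only a minimal sketch: `trim_at_next_turn` and `TURN_MARKER` are hypothetical names, not part of serge, and the marker spellings are taken from this thread rather than from any model documentation.

```python
import re

# Assumption: the model starts an invented turn with one of these markers.
# "### HUMAN" / "### RESPONSE" are reported in this thread; "### Assistant"
# is included because a later comment uses that spelling.
TURN_MARKER = re.compile(r"\n?###\s*(human|response|assistant)\b", re.IGNORECASE)


def trim_at_next_turn(generated: str) -> str:
    """Return the text that precedes the first turn marker, if any."""
    match = TURN_MARKER.search(generated)
    if match is None:
        # No invented turn found: keep the whole answer.
        return generated.strip()
    return generated[: match.start()].strip()
```

If the backend in use supports stop sequences, passing the marker as a stop string would avoid generating the extra text in the first place; the post-processing above is just the simplest thing to try.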

fat-tire Apr 04 '23 07:04

Saw this one too which might be the same file.

https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g

bradgillap Apr 07 '23 02:04

I think this would indeed be a good model to add. However, I think it would be better if we either A. have a box that lets us input a direct download link for a model, or B. make the list of models a plain array so that anybody can submit a pull request to add more models easily.
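To make the two options concrete, here is a rough sketch of what they could look like. Everything in it is hypothetical: `ModelEntry`, `MODELS`, `find_model`, and `entry_from_url` are illustrative names, the URLs are placeholders, and none of it reflects how serge actually stores its model list.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class ModelEntry:
    name: str
    url: str
    filename: str


# Option B: a plain list, so adding a model is a one-line pull request.
# These entries are placeholders, not real download links.
MODELS: List[ModelEntry] = [
    ModelEntry("example-7b", "https://example.com/models/example-7b.bin", "example-7b.bin"),
    ModelEntry("example-13b", "https://example.com/models/example-13b.bin", "example-13b.bin"),
]


def find_model(name: str, models: List[ModelEntry] = MODELS) -> Optional[ModelEntry]:
    """Look up a model by name; return None if it is not listed."""
    for entry in models:
        if entry.name == name:
            return entry
    return None


def entry_from_url(url: str) -> ModelEntry:
    """Option A: build an entry from a user-supplied direct download link."""
    filename = PurePosixPath(urlparse(url).path).name
    if not filename:
        raise ValueError(f"URL does not point at a file: {url!r}")
    return ModelEntry(name=PurePosixPath(filename).stem, url=url, filename=filename)
```

The two options are not exclusive: a user-supplied link can be turned into the same kind of entry the built-in list uses, so the downloader only needs one code path.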

Stetsed Apr 07 '23 18:04

Just put the model in the weights folder and that's all that was needed. I can't vouch for it but it seems to work okay... the answers were coherent. Didn't play with it too much.

Maybe I got the wrong version, but it's a .safetensors file and Serge doesn't seem to see it at all.
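A quick way to check for this kind of mismatch is to list the weights folder and split the files by extension. The sketch below assumes that only `.bin` files are picked up; that is a guess based on the reports in this thread, not confirmed serge behavior, and `classify_weights` and `SUPPORTED_SUFFIXES` are hypothetical names.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: only files with these suffixes are recognized as models.
SUPPORTED_SUFFIXES = {".bin"}


def classify_weights(folder: str) -> Tuple[List[str], List[str]]:
    """Split the files in `folder` into (usable, ignored) by file suffix."""
    usable: List[str] = []
    ignored: List[str] = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        if path.suffix.lower() in SUPPORTED_SUFFIXES:
            usable.append(path.name)
        else:
            ignored.append(path.name)
    return usable, ignored
```

A `.safetensors` download would land in the ignored list here, which would suggest looking for a different build of the same model rather than renaming the file.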

Reezlaw Apr 13 '23 22:04

I have done it: Vicuna 1.0 and serge working together. The weights are the existing ones from Hugging Face. The most annoying part is changing

    if chat.questions != None:
        for question in chat.questions:
            if question.error != None:  # skip errored out prompts
                continue
            prompt += "### Human:\n" + question.question + "\n"
            prompt += "### Assistant:\n" + question.answer + "\n"
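For anyone who wants to try the same change outside of serge, the loop above can be wrapped into a self-contained function. This is a sketch under assumptions: `Question`, `Chat`, and `build_prompt` are hypothetical stand-ins that only mirror the attribute names used in the snippet, and the final step that appends the new question with an open "### Assistant:" turn is a guess at how the prompt is completed, not something stated in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Question:
    # Stand-in for serge's question object; only the fields used above.
    question: str
    answer: str = ""
    error: Optional[str] = None


@dataclass
class Chat:
    questions: Optional[List[Question]] = None


def build_prompt(chat: Chat, new_question: str) -> str:
    """Build a Vicuna-style prompt from the chat history plus a new question."""
    prompt = ""
    if chat.questions is not None:
        for question in chat.questions:
            if question.error is not None:
                continue  # skip errored out prompts
            prompt += "### Human:\n" + question.question + "\n"
            prompt += "### Assistant:\n" + question.answer + "\n"
    # Assumption: the new question goes last, leaving the assistant turn open.
    prompt += "### Human:\n" + new_question + "\n"
    prompt += "### Assistant:\n"
    return prompt
```

Keeping the markers in one function means switching to another model's format only requires touching the two marker strings.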

BenjiKCF Apr 14 '23 02:04

I have done it: Vicuna 1.0 and serge working together. The weights are the existing ones from Hugging Face. The most annoying part is changing

    if chat.questions != None:
        for question in chat.questions:
            if question.error != None:  # skip errored out prompts
                continue
            prompt += "### Human:\n" + question.question + "\n"
            prompt += "### Assistant:\n" + question.answer + "\n"

I was testing Vicuna (Legacy now) as well and noticed the "### HUMAN" response. Maybe a stupid question, but in which file do I need to change the configuration above?

Also, is it possible to load a multi-bin model (such as Vicuna 1.1) in Serge?

Update: solved it via this Vicuna 1.1 model. I would still like to know, though :)

FumbleNL Apr 14 '23 18:04

I added support for Open Assistant and Vicuna models to the serge downloader. If you're missing some models you'd like to see, feel free to post in this issue: https://github.com/nsarrazin/serge/issues/217

nsarrazin Apr 24 '23 23:04