text-generation-webui
Support for MPT, INCITE, WizardLM, StableLM, Galactica, Vicuna, Guanaco, and Baize instruction following
This fixes Issue https://github.com/oobabooga/text-generation-webui/issues/1383
- Adds characters and prompts for MPT, INCITE, WizardLM, StableLM, Galactica, Vicuna, Guanaco, and Baize.
- Uses the Alpaca format for LaMini models.
- Adds presets for StableLM.
- Fixes incorrect context for Vicuna.
- Fixes incorrect version name for Vicuna 1.1.
- Automatically detects Vicuna version based on filename.
- Improves model type detection in load_quantized.
Notes:
- ~~Vicuna versions are not automatically detected yet.~~
- Galactica supports many different prompt formats; I only added the main ones.
- This is not fully tested (I only have a 4GB graphics card).
- StableLM is a bad model, so getting bad results doesn't mean it isn't working.
@oobabooga can we choose what instruction format to run? Because I'm currently training a Vicuna LoRA and I could not choose to use its format unless I renamed the base model from llama-13b to Vicuna-13b.
Edit: my bad, I was wrong.
You can create a rule for it under models/config-user.yaml like:

```yaml
my-model-llama-13b:
  mode: 'instruct'
  instruction_template: 'Vicuna'
```
@oobabooga can we choose what instruction format to run?
Yes, just scroll down on the chat page and set the mode to instruct; there should then be a dropdown box for choosing which instruction format to use.
Unless by "run" you mean doing the actual training, in which case I have no idea.
Because I'm currently training a Vicuna LoRA and I could not choose to use its format unless I renamed the base model from llama-13b to Vicuna-13b.
Vicuna is actually two completely different instruction formats. Vicuna v0 (which uses ### and calls you a human) and Vicuna v1.1 (which calls you a user). Oobabooga can't detect which it is yet, and I think it defaults to Vicuna v0 if you include vicuna in the name somewhere. Make sure you know, and let your users know, which Vicuna instruction format version your model or LoRA was trained for.
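For illustration, here is a minimal Python sketch of the two formats; the system-prompt wording and helper names are assumptions for this example, not taken from this PR:

```python
# Hypothetical sketch of the two Vicuna prompt formats described above.
# The system-prompt wording here is an assumption, not the PR's actual template.

def vicuna_v0_prompt(user_message: str) -> str:
    # v0 tags turns with "###" and calls the user a human
    return (
        "A chat between a curious human and an artificial intelligence assistant.\n"
        f"### Human: {user_message}\n"
        "### Assistant:"
    )

def vicuna_v11_prompt(user_message: str) -> str:
    # v1.1 drops the "###" markers and calls the user a user
    return (
        "A chat between a curious user and an artificial intelligence assistant.\n"
        f"USER: {user_message}\n"
        "ASSISTANT:"
    )
```

Prompting a model with the wrong variant tends to produce the kind of degraded output reported elsewhere in this thread.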
I haven't tried LoRAs yet, so I don't know how (or if) they're detected. Currently, I'm just adding basic support for the latest instruction formats while they're still a hot topic and people are downloading them and trying them in Oobabooga. But these changes keep not getting pulled, so people keep trying the latest models in Oobabooga, assuming they are supported, and getting bad results. Especially when influencers are making videos about using them in Oobabooga and telling people they work. Adding instruction-following support for models that are in the news is extremely time-sensitive. I'll work on the more advanced stuff like LoRAs later.
@CarlKenner thanks for the info yeah I see that now. idk why I thought that was blank... oh well. Sorry for the confusion.
It should (in theory) now automatically detect Vicuna v0 and Vicuna v1.1 models based on the filenames.
Are you sure the WizardLM instruction following is correct? It seems to start talking to itself after a while and almost completely ignoring me. It takes the role of the human as well and then keeps going back and forth answering like a human and like an AI, but also ignoring what I said.
Also a syntax error at line 138 in GPTQ_loader.py
Maybe it's the "You" that is added for the user's name. But removing it also means the model cannot continue from a newline; putting \n in the user's name fixes that. I'm still not sure whether the self-talking is fixed by that, though.
Are you sure the WizardLM instruction following is correct?
No. I'm not sure.
Maybe it's the "You" that is added for the user's name.
I don't think it should be adding anything for the user's name. I'll have to check what's happening.
after a while
Do you mean in follow-up questions? WizardLM does not have a concept of chat history with multiple questions. There's nothing to tag the start of a question.
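To illustrate the point, a single-turn prompt of this kind can be sketched like so (the exact format string is an assumption about the training format, not taken from this PR):

```python
# Hypothetical single-turn prompt builder: there is no role tag marking the
# start of the user's question, so earlier turns in a chat history cannot
# be delimited and the model has no concept of multiple questions.

def wizardlm_prompt(instruction: str) -> str:
    # Assumed format: bare instruction followed by a response marker
    return f"{instruction}\n\n### Response:"
```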
Also a syntax error at line 138 in GPTQ_loader.py
Oops. Sorry. I'd better fix that.
Thank you so much for the bug reports.
after a while
Do you mean in follow-up questions? WizardLM does not have a concept of chat history with multiple questions. There's nothing to tag the start of a question.
Ooh, that would make sense. But I think the upside is that it's actually really good for other formats like roleplay. If you use character names for dialogue, it works really well, unlike most of the other instruct models, which sometimes try to be the assistant again.
I want to merge this PR asap, but I have to test it first.
I want to merge this PR asap, but I have to test it first.
Very wise. I struggle to test things because I only have a 4GB GTX 970, so it's not all tested.
~~You might want to pull my other pull request before testing this, because the verbose mode currently doesn't show the special tokens in the input even if they are present and correct.~~
I will take infinitely long to review this PR, so I will yolo it and make the necessary edits later if any.