[Feature] Remove defaults for model templates and system prompt
Feature Request
Remove defaults for model templates.
- System Prompt
- Chat Template
- Tool Calling
Add GUI warnings telling the user that these must be configured before the model can be used...
Link to Wiki documentation explaining how to configure any sideloaded or "discovered" model.
The idea here is that the defaults we have for system prompt and chat template are in large part detrimental. Very few models will work well with these defaults.
Instead of parsing gguf files when installing a new model and deducing the proper templates from them, perhaps we should just require the user to fill out these templates by hand. If the templates aren't filled out, the model may be 'installed' but should not be 'enabled'.
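A rough sketch of how that gating could look (the names here are hypothetical, just to make the proposal concrete; this is not actual GPT4All code):

```python
# Hypothetical sketch of the proposed gating -- names are illustrative,
# not real GPT4All internals.
from dataclasses import dataclass

@dataclass
class SideloadedModel:
    filename: str
    system_prompt: str = ""   # no shipped default
    chat_template: str = ""   # no shipped default

    @property
    def enabled(self) -> bool:
        # Installed but disabled until the user supplies a chat template.
        return bool(self.chat_template.strip())

model = SideloadedModel("my-discovered-model.gguf")
assert not model.enabled  # the GUI would warn and link to the wiki here
```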
I wrote this for the situation:
https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models
I believe this title is now concise and can be expected to remain unchanged.
@3Simplex Thanks for the guide. However, the crucial part about finding the prompt is not really detailed.
Example of a "chat template" that might be found:
```jinja
{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}
```
The pseudo-code:
```jinja
{% set loop_messages = messages %}
{% for message in loop_messages %}
{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
{% if loop.index0 == 0 %}
{% set content = bos_token + content %}
{% endif %}
{{ content }}
{% endfor %}
{% if add_generation_prompt %}
{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{% endif %}
```
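For reference, you can render this template with the jinja2 package to see exactly what string it produces (the message and bos_token below are made up for illustration):

```python
from jinja2 import Template

# The chat template from above, written as one Python string.
jinja_src = (
    "{% set loop_messages = messages %}{% for message in loop_messages %}"
    "{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'"
    "+ message['content'] | trim + '<|eot_id|>' %}"
    "{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}"
    "{{ content }}{% endfor %}"
    "{% if add_generation_prompt %}"
    "{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}"
)

print(Template(jinja_src).render(
    messages=[{"role": "user", "content": "Hi there"}],
    bos_token="<|begin_of_text|>",
    add_generation_prompt=True,
))
# <|begin_of_text|><|start_header_id|>user<|end_header_id|>
#
# Hi there<|eot_id|><|start_header_id|>assistant<|end_header_id|>
#
```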
The prompt template as I understand it:
```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

%2
```
Yet it fails miserably, with the answer prompt looping.
> @3Simplex Thanks for the guide. However, the crucial part about finding the prompt is not really detailed. ...paraphrasing... Yet it fails miserably with answer prompt looping.
I see you are looking at a Jinja template. Thanks for reminding me; I just included that as an "Advanced Topic", along with how to make sure a model will work if it was not built correctly.
I can attempt to answer both of these questions here before I add them to the Wiki. (Looks like you did well decoding the template.)
Breaking down a Jinja template is fairly straightforward if you can follow a few rules.
You must keep the tokens as written in the Jinja and strip out all of the other syntax. Also watch for mistakes here: sometimes model authors fail to include a functional Jinja template. The Jinja must have the following tokens:
- role beginning identifier tag
- role ending identifier tag
- roles
Sometimes they are combined into one, like `<|user|>`, which indicates both a role and a beginning tag.
Let's start at the beginning of this Jinja.
> {% set loop_messages = messages %}
> {% for message in loop_messages %}
> {% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
Most of this has to be removed because it's irrelevant to the LLM unless we get a Jinja parser from some nice contributor.
We keep this `<|start_header_id|>`, as it states it is the starting header for the role.
We translate this `+ message['role'] +` into the role to be used for the template.
You will have to figure out which role names this model uses, but these are the common ones; a quick way to search a template for them is sketched after this list.
Sometimes the roles will be shown in the Jinja, sometimes they won't.
- system (if the model supports a system prompt)
  - look for something like "if role system"
- user or human (sometimes)
- assistant or model (sometimes)
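If you want to automate that first pass, a simple text search over the template source does the job (a rough heuristic only; as noted above, some templates never name their roles):

```python
import re

COMMON_ROLES = ["system", "user", "human", "assistant", "model"]

def guess_roles(jinja_src: str) -> list[str]:
    # Report any common role name mentioned anywhere in the template.
    # Templates that only use message['role'] generically won't match.
    return [role for role in COMMON_ROLES
            if re.search(rf"\b{role}\b", jinja_src)]

# On the template above this finds only 'assistant' -- the user and
# system roles never appear literally, which is exactly the
# "sometimes they won't" case.
```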
We keep this `<|end_header_id|>`.
We keep this `\n\n`, which translates into one new line (press Enter) for each `\n` you see (two in this case).
Now we will translate `message['content']` into the variable used by GPT4All:
- `%1` for user messages
- `%2` for assistant replies

We keep this `<|eot_id|>`, which indicates the end of whatever the role was doing.
Now we have our "content" from this Jinja block, `{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}`, with all the extra stuff removed.
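Stripped down, every message in this template reduces to the same skeleton (ROLE and CONTENT stand in for the actual role name and message text):

```
<|start_header_id|>ROLE<|end_header_id|>

CONTENT<|eot_id|>
```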
From what I can tell, GPT4All sends the BOS automatically and waits for the LLM to send the EOS in return.
- BOS will tell the LLM where it begins generating a new message from. You can skip the BOS token.
- "content" is also sent automatically by GPT4All, so you can skip this `content`. (Not to be confused with `message['content']`.)
This whole section is not used by the GPT4All template.
```jinja
{% if loop.index0 == 0 %}
{% set content = bos_token + content %}
{% endif %}
{{ content }}
{% endfor %}
```
Finally, we get to the part that shows a role defined for the "assistant". The way it is written implies the other one above is for either a system or user role. (Probably both, because it would simply show "user" if it weren't dual-purpose.)
This is left open-ended for the model to generate from this point forward. As we can see from its absence, the LLM is expected to provide an EOS tag when it is done generating. Follow the same rules as we did above.
```jinja
{% if add_generation_prompt %}
{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{% endif %}
```
This also provides us with an implied confirmation of how it should all look when it's done.
We will break this into two parts for GPT4All.
A System Prompt: (There is no variable; you just write what you want in it.)
```
<|start_header_id|>system<|end_header_id|>

YOUR CUSTOM SYSTEM PROMPT TEXT HERE<|eot_id|>
```
A Chat Template:
```
<|start_header_id|>user<|end_header_id|>

%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

%2
```
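As a sanity check, substituting a message into the chat template by hand should line up with what the original Jinja renders. A minimal sketch, assuming `%1` and `%2` behave as plain placeholders for the user message and the model's reply:

```python
chat_template = """<|start_header_id|>user<|end_header_id|>

%1<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

%2"""

# Fill in the user turn; the model generates where %2 sits.
prompt = chat_template.replace("%1", "Hi there")
print(prompt.split("%2")[0])
# Compare with the jinja2 render from earlier: the BOS is absent
# here because GPT4All adds it on its own.
```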
Tune in next time for the conclusion. (Why didn't it work? It looks like it's all good!)
Hint: You probably did it right and the model is not built properly. (I'll explain how to learn that too. After I eat.)
I'll have to conclude this tomorrow. Details will include where to look for these things and what to look for when you are looking at the files.
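In the meantime, one place to start looking is the tokenizer_config.json published alongside most Hugging Face models; it usually carries the raw Jinja chat template and the special tokens (a minimal sketch, assuming the standard Hugging Face file layout):

```python
import json

# tokenizer_config.json normally sits next to the original model
# files on Hugging Face (the pre-GGUF repository).
with open("tokenizer_config.json") as f:
    cfg = json.load(f)

print(cfg.get("bos_token"))      # e.g. <|begin_of_text|>
print(cfg.get("eos_token"))      # e.g. <|eot_id|>
print(cfg.get("chat_template"))  # the raw Jinja we decoded above
```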
Thanks, would it help if I attach the .json files found alongside the model for you to see? It looks to me like they have taken some personal liberties with the tokens. I don't know, I'm pretty new to GPT4All.
So basically, if the model is not built correctly, it's game over? However, it seems that the Colab they link does work.
Here is me saying hi:
I have finished explaining the advanced topics:
- https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models#advanced-topics
- https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models#jinja2-explained
  - Expanded on this.
- https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models#configuration-files-explained
  - Created and expanded on this, which will help you figure out what is happening.
- You can talk with someone in real time here; I'll probably be there.