Prompt for conversation title is not enough for small models
**Describe the bug**
The prompt used to generate conversation titles is sometimes not enough for small models.
**Expected behavior**
With a small model, it should just write a single title.
**Debugging information**
I'm using gemma3 1b for title generation. I am on the latest Alpaca version, installed via Flatpak.
**Possible solutions**
Either improve the prompt, make the title prompt customizable, or ignore this issue. Honestly, I don't know enough about it to say :p
Thanks for your bug report!
Unfortunately, as the saying goes, "you can't fix stupid" - some models are just too small to understand their task correctly. Alpaca already uses structured outputs to make sure models really, really just answer with a title - but some slip and mess up anyway.
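For the curious, structured outputs work roughly like this. This is only a sketch, not Alpaca's actual code; the schema, prompt, and model tag are made up for illustration:

```python
import json
import requests

# Sketch only: a JSON schema that leaves the model no room to answer
# with anything but a single title field.
TITLE_SCHEMA = {
    "type": "object",
    "properties": {"title": {"type": "string"}},
    "required": ["title"],
}

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "gemma3:1b",
        "messages": [
            {"role": "system", "content": "Write a short title for this conversation."},
            {"role": "user", "content": "user: How do I install Alpaca?\nassistant: ..."},
        ],
        "format": TITLE_SCHEMA,  # Ollama constrains the reply to this JSON schema
        "stream": False,
    },
    timeout=120,
)
# The constrained reply arrives as a JSON string in message.content.
title = json.loads(resp.json()["message"]["content"])["title"]
print(title)
```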
Especially smaller models like SmolLM2 135M, 360M etc. will probably not be able to get this right in one shot for months or even years of development to come. Thus, I've prepared a pull request (#870) that simply caps the maximum title length at 30 characters. That should at least stop titles from becoming obscenely long, but it obviously won't improve the output quality of these models.
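The PR itself isn't reproduced here; a plausible client-side version of that cap could be as simple as:

```python
MAX_TITLE_LENGTH = 30  # hypothetical constant; PR #870 may implement the cap differently

def cap_title(raw_title: str) -> str:
    """Collapse stray whitespace and hard-cap the generated title length."""
    title = " ".join(raw_title.split())
    if len(title) > MAX_TITLE_LENGTH:
        title = title[:MAX_TITLE_LENGTH].rstrip()
    return title
```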
Hope that mitigates your issue (at least for a little bit)!
Have a great day.
@azomDev @mags0ft
I have a theory why some models may have an issue with this.
Titles used to be generated through the api/generate endpoint. If you pass a system prompt through this endpoint, it gets attached to the model's original system prompt. The instruction to the model (defined in a constant, something like "you're a helpful assistant that creates short titles") is passed as a system prompt, so it gets appended to the end of the existing system prompt, where some models may be inclined to ignore it if it interferes with other directives.
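In request terms, the old path looked roughly like this (a sketch with placeholder names, not Alpaca's real code):

```python
import requests

# Placeholder standing in for the constant mentioned above.
TITLE_INSTRUCTION = "You're a helpful assistant that creates short titles."

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3:1b",
        # Per the theory above, this system string has to compete with
        # whatever system prompt the model already carries.
        "system": TITLE_INSTRUCTION,
        "prompt": "Summarize this conversation as a short title: ...",
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])
```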
Based on my suggestion, Jeff is switching to the api/chat endpoint to speed up generation. This works differently: if you pass a system prompt, it replaces the model's system prompt (see the sketch after the list below).
So with this approach:
- pros: model is more likely to obey the instruction, since it's literally the only one it has
- cons: the title won't reflect the model's personality
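For contrast, the api/chat shape (again a sketch, not Alpaca's actual request):

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "gemma3:1b",
        "messages": [
            # The only system message the model sees for this request.
            {"role": "system", "content": "You're a helpful assistant that creates short titles."},
            {"role": "user", "content": "Summarize this conversation as a short title: ..."},
        ],
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["message"]["content"])
```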
I already proposed a third solution: pass the title instruction as part of the user prompt (only for title generation, obviously). This seems to work just as well in my tests, and may fix this issue too, since most small models are actually more likely to pay attention to the beginning of the user prompt than to the end of the system prompt. If Jeff accepts my PR #947, this might work better. We'll see. But hard-limiting the output through max_tokens was definitely the right idea too.
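A sketch of that third approach, with the instruction leading the user prompt and a hard token cap (Ollama's option for this is `num_predict`; PR #947 may do it differently):

```python
import requests

conversation = "user: how do I update Alpaca?\nassistant: ..."

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "gemma3:1b",
        "messages": [
            {
                "role": "user",
                # The instruction sits at the very start of the user prompt,
                # where small models tend to pay the most attention.
                "content": "Write a short title (a few words, no quotes) for "
                           "this conversation:\n\n" + conversation,
            }
        ],
        "options": {"num_predict": 16},  # hard cap on generated tokens
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["message"]["content"])
```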