Feature Request: Add support for ChatGLM3 in the example server
Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the README.md.
- [X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [X] I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
ChatGLM3 uses a completely new prompt format. See https://github.com/THUDM/ChatGLM3/blob/main/PROMPT_en.md
I have created a patch, https://github.com/ggerganov/llama.cpp/commit/fd3492e85836c0df4b0404a47355159f4c349a44, for examples/server/public/prompt-formats.js.
Motivation
Using the correct prompt format fixes chat errors, repetitions, and role reversals when running ChatGLM3 in the example server.
Possible Implementation
Overall Structure
A ChatGLM3 dialogue consists of several turns, each of which has a header (a role tag such as <|user|>) followed by its content. A typical multi-turn dialogue is structured as follows:
<|system|>
You are ChatGLM3, a large language model trained by Zhipu.AI. Follow the user's instructions carefully. Respond using markdown.
<|user|>
Hello
<|assistant|>
Hello, I'm ChatGLM3. What can I assist you today?
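For illustration, here is a minimal sketch of how a message list could be rendered into the layout above. It is not the contents of the linked patch: the function name formatChatGLM3, the message shape ({ role, content }), and the exact newline handling are assumptions based only on the example shown here, and only the three roles that appear in that example are handled.

```javascript
// Hypothetical helper, not part of llama.cpp: renders chat messages in the
// ChatGLM3 layout shown above (role tag on its own line, then the content).
const ROLES = new Set(["system", "user", "assistant"]);

function formatChatGLM3(messages, addGenerationPrompt = true) {
  const turns = messages.map(({ role, content }) => {
    if (!ROLES.has(role)) {
      throw new Error(`unsupported role: ${role}`);
    }
    // Assumption: one newline between the role tag and the content.
    return `<|${role}|>\n${content.trim()}`;
  });
  if (addGenerationPrompt) {
    // Leave an open assistant header so the model writes the next reply.
    turns.push("<|assistant|>\n");
  }
  // Assumption: turns are separated by a single newline.
  return turns.join("\n");
}

// Usage: build the prompt for the first user turn.
const prompt = formatChatGLM3([
  { role: "system", content: "You are ChatGLM3, a large language model trained by Zhipu.AI." },
  { role: "user", content: "Hello" },
]);
console.log(prompt);
```

Ending the prompt with an open <|assistant|> header is what keeps the model answering in the assistant role rather than continuing as the user, which is the role-reversal symptom described under Motivation.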
AFAIK support for glm3 and glm4 has already been added: https://github.com/ggerganov/llama.cpp/pull/8031
Those are completely different files. That PR, https://github.com/ggerganov/llama.cpp/pull/8031, was for the CLI version (which some other projects, such as ollama, also use or turn into a server) and for GGUF creation. This request is for the gradio app server example, which lets you choose a chat template when you run ./llama-server from the llama.cpp GitHub repo and navigate to http://localhost:port in the browser.
This issue was closed because it has been inactive for 14 days since being marked as stale.