
Compared with Qwen1.0, the Qwen1.5 series changed from model.chat to model.generate, so what happened to the old history parameter? My locally deployed Qwen1.5-7B-Chat cannot answer questions interactively; it forgets the earlier turns of the conversation.

chengxiang123aa opened this issue 11 months ago


chengxiang123aa, Mar 06 '24 10:03

Please note that there is a misunderstanding about the operation of Qwen models. The model.chat function is not a standard API provided by the transformers library. Instead, it formats multiple messages into a template recognizable by the chat model. A comparable functionality has been implemented in the tokenizer classes through the apply_chat_template method. Relevant examples can be found in the README file. Kindly review the README for more information.

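To make the point above concrete, here is a minimal pure-Python sketch of the ChatML-style prompt that the chat template for Qwen1.5 chat models produces. In real use you would call `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from transformers as shown in the README; `build_chatml_prompt` below is a hypothetical helper written only to illustrate how multi-turn history is carried inside a single prompt string.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Format a list of {"role", "content"} dicts as a ChatML-style prompt.

    This mimics what apply_chat_template does for Qwen chat models:
    every turn of the history is serialized into the prompt, which is
    why the model can "remember" earlier turns.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Cue the model to continue as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi, my name is Alice."},
    {"role": "assistant", "content": "Hello Alice! How can I help?"},
    # The model can answer this only because the earlier turns
    # are included in the prompt.
    {"role": "user", "content": "What is my name?"},
]
prompt = build_chatml_prompt(history)
print(prompt)
```

The prompt string would then be tokenized and passed to `model.generate`; appending each assistant reply back onto `history` before the next turn is what replaces the old `history` argument of `model.chat`.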

jklj077, Mar 11 '24 13:03

Please refer to the code below. The recommended setup, per the docs, is to start an OpenAI-API-compatible server with vLLM, llama.cpp, Ollama, or similar, and then run this code against it.

The same idea applies to backends that do not expose an OpenAI-compatible API.
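As a sketch, such a server could be started with vLLM roughly as follows. The model name and port are assumptions, and the exact CLI flags vary across vLLM versions, so check the vLLM documentation for the current invocation:

```shell
# Start an OpenAI-compatible API server with vLLM (sketch; flags may
# differ between vLLM versions). Model name and port are assumptions.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen1.5-7B-Chat \
    --port 8000
```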

from openai import OpenAI

client = OpenAI(base_url="http://your-qwen-api-server:8000/v1", api_key="test")  # an api_key must be provided, otherwise the client raises an error


def main():
    messages = []
    while True:
        messages.append({"role": "user", "content": input(("\n" * 2 if messages else "") + "user: ")})
        # Set model to the name your backend serves (e.g. the value passed to --model).
        stream = client.chat.completions.create(model="", messages=messages, stream=True)
        content = ''
        for chunk in stream:
            delta = chunk.choices[0].delta.content or ""
            print(delta, end="", flush=True)
            content += delta
        messages.append({"role": "assistant", "content": content})


if __name__ == '__main__':
    main()

LucienShui, Mar 11 '24 16:03