Qwen2.5
Compared with Qwen1.0, the Qwen1.5 series switched from model.chat to model.generate. What happened to the previous history parameter? My locally deployed Qwen1.5-7B-Chat cannot answer questions interactively; it forgets the earlier turns of the conversation.
Please note that there is a misunderstanding about how Qwen models operate. The model.chat function is not a standard API provided by the transformers library; it simply formats multiple messages into a template the chat model can understand. Comparable functionality is implemented in the tokenizer classes through the apply_chat_template method. Relevant examples can be found in the README file; kindly review the README for more information.
请注意,关于Qwen模型的工作方式存在一个误区。model.chat
函数并非transformers
库提供的标准API。它的功能是将多条消息格式化为聊天模型能够理解的模板。类似的功能已在tokenizer类中通过apply_chat_template
方法实现。相关的示例在README文件中有提供,请参阅README以获取更多信息。
Please refer to the code below. The recommended approach is to follow the documentation and start an OpenAI-API-compatible server with vllm, llama.cpp, ollama, etc., then use this code against it. The same pattern applies even without the OpenAI API format: keep a messages list and append every user turn and assistant reply to it.
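For example, a server for Qwen1.5-7B-Chat could be started with vLLM's OpenAI-compatible entrypoint along these lines (the exact flags and entrypoint vary across vLLM versions; check the documentation for your installation):

```shell
# Serve Qwen/Qwen1.5-7B-Chat behind an OpenAI-compatible HTTP API on port 8000.
# Flags may differ by vLLM version; consult `--help` for your installation.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen1.5-7B-Chat \
    --port 8000
```

The client code below then points its base_url at this server.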
from openai import OpenAI

# An api_key must be provided even if the server ignores it,
# otherwise the client raises an error.
client = OpenAI(base_url="http://your-qwen-api-server:8000/v1", api_key="test")

def main():
    messages = []  # full conversation history, re-sent on every turn
    while True:
        messages.append({"role": "user", "content": input(("\n" * 2 if messages else "") + "user: ")})
        # Some servers (e.g. vLLM) require the served model name instead of "".
        stream = client.chat.completions.create(model="", messages=messages, stream=True)
        content = ""
        for chunk in stream:
            delta = chunk.choices[0].delta.content or ""
            print(delta, end="", flush=True)
            content += delta
        # Append the assistant reply so the next turn keeps the context.
        messages.append({"role": "assistant", "content": content})

if __name__ == "__main__":
    main()