Qwen-Agent
The workstation editor submits data to the wrong URL, causing an HTTP 404 error
When running the qwen2/Qwen2-beta-4B-Chat model in ollama and starting BrowserQwen, the editor on the workstation page submits data to the wrong URL. Ollama's API documentation says the completion endpoint is "/api/chat" (see: https://github.com/ollama/ollama/blob/main/docs/api.md), but when the "continue" button is clicked, the data is submitted to "/api/chat/completions", which causes an HTTP 404 error. Error output on the BrowserQwen side:

Traceback (most recent call last):
  File "/home/kevin/Qwen-Agent/qwen_server/workstation_server.py", line 387, in generate
    for chunk in output_beautify.convert_to_full_str_stream(response):
  File "/home/kevin/Qwen-Agent/qwen_server/output_beautify.py", line 85, in convert_to_full_str_stream
    for message_list in message_list_stream:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/agents/article_agent.py", line 37, in _run
    for trunk in res:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/prompts/write_from_scratch.py", line 52, in _run
    for trunk in res_sum:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/base.py", line 423, in _convert_messages_iterator_to_target_type
    for messages in messages_iter:
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 60, in _chat_stream
    response = self._chat_complete_create(model=self.model,
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 49, in _chat_complete_create
    return client.chat.completions.create(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/resources/chat/completions.py", line 663, in create
    return self._post(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 1200, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 889, in request
    return self._request(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 980, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: 404 page not found
Error messages on the ollama server side:

llama_new_context_with_model: graph splits (measure): 1
time=2024-03-01T18:23:49.073+08:00 level=INFO source=dyn_ext_server.go:161 msg="Starting llama main loop"
[GIN] 2024/03/01 - 18:23:49 | 200 | 5.4693876s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/01 - 18:25:59 | 404 | 2.7µs | 127.0.0.1 | POST "/api/chat/completions"
How can this be fixed? Could the completion URL be made configurable?
According to https://github.com/ollama/ollama/blob/main/docs/openai.md, api_base (also called base_url) should be set to a value like http://localhost:11434/v1/ . Note the /v1/ suffix; that suffix may not be mentioned in the api.md document you cited.
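For reference, a minimal sketch (assuming a local ollama on port 11434) of how the openai SDK builds the final URL from base_url, which is why the /v1/ suffix matters:

from openai import OpenAI

# The openai SDK appends "chat/completions" to base_url, so:
#   base_url='http://localhost:11434/api' -> POST /api/chat/completions (404: the native API only has /api/chat)
#   base_url='http://localhost:11434/v1/' -> POST /v1/chat/completions  (ollama's OpenAI-compatible route)
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required by the SDK but ignored by ollama
)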
Note: I'm installing ollama to try this out myself; the network here is a bit slow...
The startup log of my ollama instance shows a base URL without /v1/. I started BrowserQwen with the following command and still get a 404 error: python3 run_server.py --llm qwen2/Qwen2-beta-4B-Chat --model_server http://127.0.0.1:11434/v1/api

Log on the ollama side:

llama_new_context_with_model: graph splits (measure): 1
time=2024-03-04T11:34:04.291+08:00 level=INFO source=dyn_ext_server.go:161 msg="Starting llama main loop"
[GIN] 2024/03/04 - 11:34:04 | 200 | 4.3672342s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/04 - 11:35:33 | 200 | 3.4914991s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/04 - 11:38:33 | 404 | 2.9µs | 127.0.0.1 | POST "/v1/api/chat/completions"

So I think its endpoint API documentation should be the reference: https://github.com/ollama/ollama/blob/main/docs/api.md . I'm also a beginner and don't know whether ollama can serve qwen in an OpenAI-compatible API mode.
Try --model_server http://127.0.0.1:11434/v1 ? Note that it ends with v1, not v1/api.
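To check the routes independently of BrowserQwen, a quick probe along these lines may help (a sketch; it assumes ollama runs on 127.0.0.1:11434 and that a model named qwen:4b has already been pulled):

import requests

# The OpenAI-compatible route is /v1/chat/completions, so the /v1/api/... path
# used above is expected to return 404.
payload = {
    'model': 'qwen:4b',  # assumed name; use whatever `ollama list` shows
    'messages': [{'role': 'user', 'content': 'hi'}],
    'stream': False,
}
for path in ('/v1/chat/completions', '/v1/api/chat/completions'):
    r = requests.post('http://127.0.0.1:11434' + path, json=payload)
    print(path, r.status_code)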
With --model_server http://127.0.0.1:11434/v1 I still get the same 404 error. The startup command was: python3 run_server.py --llm qwen2/Qwen2-beta-4B-Chat --model_server http://127.0.0.1:11434/v1
Errors seen on the ollama server side:

llama_new_context_with_model: graph splits (measure): 1
time=2024-03-06T15:13:57.787+08:00 level=INFO source=dyn_ext_server.go:161 msg="Starting llama main loop"
[GIN] 2024/03/06 - 15:13:57 | 200 | 4.8649114s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/06 - 15:14:23 | 200 | 10.8918156s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/06 - 15:15:38 | 404 | 303.5µs | 127.0.0.1 | POST "/v1/chat/completions"
Error messages on the workstation side:

Traceback (most recent call last):
  File "/home/kevin/Qwen-Agent/qwen_server/workstation_server.py", line 387, in generate
    for chunk in output_beautify.convert_to_full_str_stream(response):
  File "/home/kevin/Qwen-Agent/qwen_server/output_beautify.py", line 85, in convert_to_full_str_stream
    for message_list in message_list_stream:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/agents/article_agent.py", line 37, in _run
    for trunk in res:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/prompts/write_from_scratch.py", line 52, in _run
    for trunk in res_sum:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/base.py", line 423, in _convert_messages_iterator_to_target_type
    for messages in messages_iter:
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 60, in _chat_stream
    response = self._chat_complete_create(model=self.model,
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 49, in _chat_complete_create
    return client.chat.completions.create(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/resources/chat/completions.py", line 663, in create
    return self._post(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 1200, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 889, in request
    return self._request(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 980, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'error': {'message': "model 'qwen2/Qwen2-beta-4B-Chat' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1199, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 519, in async_iteration
    return await iterator.__anext__()
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
  File "/home/kevin/.local/lib/python3.8/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/kevin/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/home/kevin/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 649, in gen_wrapper
    yield from f(*args, **kwargs)
  File "/home/kevin/Qwen-Agent/qwen_server/workstation_server.py", line 392, in generate
    raise ValueError(ex)
ValueError: Error code: 404 - {'error': {'message': "model 'qwen2/Qwen2-beta-4B-Chat' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
The error message contains:
{'error': {'message': "model 'qwen2/Qwen2-beta-4B-Chat' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
Judging from this, the request is now reaching the correct address, but the qwen model has not yet been set up on the ollama side.
You could first follow https://qwen.readthedocs.io/en/latest/run_locally/ollama.html to make sure that running qwen with ollama works on its own.
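A small sanity check along these lines might help (a sketch; the model name qwen:4b is an assumption, use whatever `ollama list` reports):

import requests

# Confirm ollama's native chat endpoint answers before involving BrowserQwen.
r = requests.post('http://127.0.0.1:11434/api/chat',
                  json={'model': 'qwen:4b',  # assumed name
                        'messages': [{'role': 'user', 'content': 'hello'}],
                        'stream': False})
print(r.status_code)
print(r.json().get('message', {}).get('content'))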
Also, I'm not sure whether --llm qwen2/Qwen2-beta-4B-Chat is correct; perhaps try --llm qwen4b or --llm qwen:4b (not sure which is right; the exact model names ollama knows about can be checked as sketched below).
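A sketch for listing the exact names, assuming ollama's native GET /api/tags endpoint for locally pulled models:

import requests

# The "name" field of each entry is the string to pass to --llm
# (e.g. 'qwen:4b' or 'qwen:0.5b', depending on what was pulled).
tags = requests.get('http://127.0.0.1:11434/api/tags').json()
for m in tags.get('models', []):
    print(m['name'])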
P.S.: The company network is poor; I still haven't managed to install ollama...
Hi, I've finally installed ollama and got it working with a command like the following:
python run_server.py --llm qwen:0.5b --model_server http://127.0.0.1:11434/v1
It still doesn't work for me. The qwen model can be started in ollama and used normally; below is a conversation after the qwen model started:
When running ollama run qwen, the startup information seen on the ollama server side is as follows:
Given the 404 errors on the ollama server side, is it the case that whenever BrowserQwen hits an HTTP 404 error, the standard error output always says the model was not found? I tried setting the --llm parameter to various values such as qwen2 and qwen2:4B, and they all give the same 404 error saying the model was not found:
ValueError: Error code: 404 - {'error': {'message': "model 'qwen2:4B' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
ValueError: Error code: 404 - {'error': {'message': "model 'qwen2' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
ValueError: Error code: 404 - {'error': {'message': "model 'qwen:4b' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
ValueError: Error code: 404 - {'error': {'message': "model 'qwen:0.5b' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}
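For reference, a sketch (assuming ollama on 127.0.0.1:11434) showing why the two kinds of 404 in this thread look different: a wrong route returns a plain "404 page not found", while /v1/chat/completions with an unknown model name returns ollama's JSON error body, which the openai SDK surfaces as the "model ... not found" message:

import requests

# Post to the correct route with a deliberately unknown model name and inspect
# the body; it should resemble the JSON errors quoted above.
r = requests.post('http://127.0.0.1:11434/v1/chat/completions',
                  json={'model': 'no-such-model',  # deliberately wrong
                        'messages': [{'role': 'user', 'content': 'hi'}]})
print(r.status_code)
print(r.text)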
I ran the following, in order:
ollama serve
ollama run qwen:0.5b
# then exit with /bye
python run_server.py --llm qwen:0.5b --model_server http://127.0.0.1:11434/v1
and after that it worked.
Yes, those are the commands for starting ollama and running the model. The difference from your setup is that I used the 4b model, i.e., ollama run qwen downloads and runs the 4b model by default, which is also why the --llm parameter in my BrowserQwen command differs from yours. I'd like to know in which source file I can change the completion request URL submitted to ollama; I'd like requests to go to "/v1/chat/" instead of "/v1/chat/completions". This may be the key to the problem.
The request path isn't something I wrote; it's produced by the openai SDK, which Qwen-Agent calls. See https://github.com/QwenLM/Qwen-Agent/blob/main/qwen_agent/llm/oai.py
Note: the openai SDK version I'm using is 1.13.3.
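If you want to see the exact URL the SDK posts to, something like this may work (a sketch; it assumes httpx, the HTTP client used inside the openai SDK, logs each request at INFO level, and that a model named qwen:4b is available):

import logging
from openai import OpenAI

# httpx should log a line like
#   HTTP Request: POST http://127.0.0.1:11434/v1/chat/completions "HTTP/1.1 200 OK"
# which shows that the /chat/completions suffix comes from the SDK itself.
logging.basicConfig(level=logging.INFO)

client = OpenAI(base_url='http://127.0.0.1:11434/v1', api_key='ollama')
client.chat.completions.create(
    model='qwen:4b',  # assumed model name
    messages=[{'role': 'user', 'content': 'hi'}],
)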
I'd suggest first testing the example given in https://github.com/ollama/ollama/blob/main/docs/openai.md :

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    # required but ignored
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama2',  # change this to qwen:4b
)
Make sure this runs successfully.
Does this mean qwen has to be served in an OpenAI-API-compatible mode for this to work?
Yes. When deploying a model locally, the deployment needs to expose an OpenAI-compatible API; see https://github.com/QwenLM/Qwen-Agent/issues/95#issuecomment-1987785011 . If you don't want to deploy your own model, you can use the cloud service provided by dashscope.