Qwen-Agent

The workstation editor submits data to the wrong URL, causing an HTTP 404 error

Open LaoK263 opened this issue 1 year ago • 11 comments

I am running the qwen2/Qwen2-beta-4B-Chat model in Ollama. After starting BrowserQwen, the editor on the workstation page submits data to the wrong URL. Ollama's API documentation says the completion endpoint is "/api/chat" (see: https://github.com/ollama/ollama/blob/main/docs/api.md), but when I click the "Continue" button I can see the data being posted to "/api/chat/completions", which results in an HTTP 404 error. The error reported on the BrowserQwen side:

Traceback (most recent call last):
  File "/home/kevin/Qwen-Agent/qwen_server/workstation_server.py", line 387, in generate
    for chunk in output_beautify.convert_to_full_str_stream(response):
  File "/home/kevin/Qwen-Agent/qwen_server/output_beautify.py", line 85, in convert_to_full_str_stream
    for message_list in message_list_stream:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/agents/article_agent.py", line 37, in _run
    for trunk in res:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/prompts/write_from_scratch.py", line 52, in _run
    for trunk in res_sum:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/base.py", line 423, in _convert_messages_iterator_to_target_type
    for messages in messages_iter:
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 60, in _chat_stream
    response = self._chat_complete_create(model=self.model,
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 49, in _chat_complete_create
    return client.chat.completions.create(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/resources/chat/completions.py", line 663, in create
    return self._post(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 1200, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 889, in request
    return self._request(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 980, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: 404 page not found

Error messages on the Ollama server side:

llama_new_context_with_model: graph splits (measure): 1
time=2024-03-01T18:23:49.073+08:00 level=INFO source=dyn_ext_server.go:161 msg="Starting llama main loop"
[GIN] 2024/03/01 - 18:23:49 | 200 | 5.4693876s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/01 - 18:25:59 | 404 | 2.7µs | 127.0.0.1 | POST "/api/chat/completions"

How can this be fixed? Could the completion URL be made configurable?

LaoK263 (Mar 01 '24)

According to https://github.com/ollama/ollama/blob/main/docs/openai.md, the api_base (also called base_url) should be set to something like http://localhost:11434/v1/. Note the /v1/ suffix; it may not be mentioned in the api.md document you referenced.
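As an illustration, here is a minimal sketch (not from the original exchange) of how the OpenAI Python SDK derives the final request URL from base_url; the host and port are placeholders matching the setup above:

from openai import OpenAI

# base_url with the /v1/ suffix (placeholder host/port matching Ollama's default)
client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')

# The SDK appends 'chat/completions' to base_url, so this client POSTs to
#   http://localhost:11434/v1/chat/completions
# which is the path served by Ollama's OpenAI-compatible layer. A base_url that
# does not end in /v1 (e.g. .../api) yields a path Ollama does not serve, hence the 404.
print(client.base_url)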

Note: I am installing Ollama now to try this myself; my network is a bit slow...

JianxinMa (Mar 01 '24)

The startup log of my Ollama instance shows that the base URL has no /v1/. I started BrowserQwen with the following command and still got the 404 error:

python3 run_server.py --llm qwen2/Qwen2-beta-4B-Chat --model_server http://127.0.0.1:11434/v1/api

Log on the Ollama side:

llama_new_context_with_model: graph splits (measure): 1
time=2024-03-04T11:34:04.291+08:00 level=INFO source=dyn_ext_server.go:161 msg="Starting llama main loop"
[GIN] 2024/03/04 - 11:34:04 | 200 | 4.3672342s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/04 - 11:35:33 | 200 | 3.4914991s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/04 - 11:38:33 | 404 | 2.9µs | 127.0.0.1 | POST "/v1/api/chat/completions"

So I think we should refer to their endpoint API documentation: https://github.com/ollama/ollama/blob/main/docs/api.md. I am also a newcomer, and I do not know whether Ollama can serve Qwen in an OpenAI-compatible API mode.

LaoK263 (Mar 04 '24)

Could you try --model_server http://127.0.0.1:11434/v1 ? Note that it ends with v1, not with v1/api.

JianxinMa (Mar 04 '24)

With --model_server http://127.0.0.1:11434/v1 I still get the same 404 error. The startup command was:

python3 run_server.py --llm qwen2/Qwen2-beta-4B-Chat --model_server http://127.0.0.1:11434/v1

Error seen on the Ollama server side:

llama_new_context_with_model: graph splits (measure): 1
time=2024-03-06T15:13:57.787+08:00 level=INFO source=dyn_ext_server.go:161 msg="Starting llama main loop"
[GIN] 2024/03/06 - 15:13:57 | 200 | 4.8649114s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/06 - 15:14:23 | 200 | 10.8918156s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/03/06 - 15:15:38 | 404 | 303.5µs | 127.0.0.1 | POST "/v1/chat/completions"

Error message on the workstation side:

Traceback (most recent call last):
  File "/home/kevin/Qwen-Agent/qwen_server/workstation_server.py", line 387, in generate
    for chunk in output_beautify.convert_to_full_str_stream(response):
  File "/home/kevin/Qwen-Agent/qwen_server/output_beautify.py", line 85, in convert_to_full_str_stream
    for message_list in message_list_stream:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/agents/article_agent.py", line 37, in _run
    for trunk in res:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/prompts/write_from_scratch.py", line 52, in _run
    for trunk in res_sum:
  File "/home/kevin/Qwen-Agent/qwen_agent/agent.py", line 65, in run
    for rsp in self._run(messages=new_messages, **kwargs):
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/base.py", line 423, in _convert_messages_iterator_to_target_type
    for messages in messages_iter:
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 60, in _chat_stream
    response = self._chat_complete_create(model=self.model,
  File "/home/kevin/Qwen-Agent/qwen_agent/llm/oai.py", line 49, in _chat_complete_create
    return client.chat.completions.create(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/resources/chat/completions.py", line 663, in create
    return self._post(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 1200, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 889, in request
    return self._request(
  File "/home/kevin/.local/lib/python3.8/site-packages/openai/_base_client.py", line 980, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'error': {'message': "model 'qwen2/Qwen2-beta-4B-Chat' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1199, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 519, in async_iteration
    return await iterator.__anext__()
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
  File "/home/kevin/.local/lib/python3.8/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/kevin/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/home/kevin/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
  File "/home/kevin/.local/lib/python3.8/site-packages/gradio/utils.py", line 649, in gen_wrapper
    yield from f(*args, **kwargs)
  File "/home/kevin/Qwen-Agent/qwen_server/workstation_server.py", line 392, in generate
    raise ValueError(ex)
ValueError: Error code: 404 - {'error': {'message': "model 'qwen2/Qwen2-beta-4B-Chat' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}

LaoK263 (Mar 06 '24)

The error message contains:

{'error': {'message': "model 'qwen2/Qwen2-beta-4B-Chat' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}

Judging from this message, the request is now reaching the correct address, but the Qwen model has not yet been set up on the Ollama side.

You can first follow https://qwen.readthedocs.io/en/latest/run_locally/ollama.html to make sure that running Qwen with Ollama works on its own.

Also, I am not sure whether --llm qwen2/Qwen2-beta-4B-Chat is correct; perhaps try --llm qwen:4b (I am not sure whether that is right either).
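As an aside (not part of the original exchange), one way to see the exact model tags that the local Ollama instance has installed is to query Ollama's native /api/tags endpoint; a minimal sketch, assuming the default host and port:

import requests

# GET /api/tags is Ollama's native endpoint for listing locally installed models.
resp = requests.get('http://127.0.0.1:11434/api/tags')
resp.raise_for_status()
for model in resp.json().get('models', []):
    print(model['name'])  # e.g. 'qwen:4b'; this tag is the value to pass via --llm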

P.S.: The company network is poor; I still have not managed to install Ollama...

JianxinMa (Mar 06 '24)

Hi, I have just finally managed to install Ollama, and got it working with a command like the following:

python run_server.py --llm qwen:0.5b --model_server http://127.0.0.1:11434/v1

JianxinMa (Mar 06 '24)

It still does not work on my side. The Qwen model can be started in Ollama and used normally; below is a conversation after the Qwen model started: (screenshot) When running ollama run qwen, the startup messages seen on the Ollama server side are as follows: (screenshot)

Combined with the 404 error on the Ollama server side, could it be that whenever BrowserQwen hits an HTTP 404 error, the standard error output always reports that the model was not found? I tried setting the --llm parameter to various values such as qwen2 and qwen2:4B, and got the same 404 error saying the model was not found:

ValueError: Error code: 404 - {'error': {'message': "model 'qwen2:4B' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}

ValueError: Error code: 404 - {'error': {'message': "model 'qwen2' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}

ValueError: Error code: 404 - {'error': {'message': "model 'qwen:4b' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}} ValueError: Error code: 404 - {'error': {'message': "model 'qwen:0.5b' not found, try pulling it first", 'type': 'api_error', 'param': None, 'code': None}}

LaoK263 (Mar 06 '24)

I ran the following, in order:

ollama serve
ollama run qwen:0.5b
# then exit with /bye
python run_server.py --llm qwen:0.5b --model_server http://127.0.0.1:11434/v1

and after that it worked.

JianxinMa (Mar 06 '24)

Yes, those are the commands for starting Ollama and running the model. The difference from your setup is that I used the 4b model; that is, ollama run qwen downloads and runs the 4b model by default, which is also why the --llm parameter in my BrowserQwen command differs from yours. What I would like to know is: in which source file can I change the completion request URL that is submitted to Ollama? I would like the request to go to "/v1/chat/" rather than "/v1/chat/completions". This may be the key to the problem.

LaoK263 (Mar 07 '24)

The request path is not something I wrote; it is built by the OpenAI SDK that Qwen-Agent calls, and the OpenAI SDK is what submits the request. See https://github.com/QwenLM/Qwen-Agent/blob/main/qwen_agent/llm/oai.py

Note: the openai SDK version I am using is 1.13.3.
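To make the point concrete, a minimal sketch (not from the original exchange): the /chat/completions suffix is fixed inside the OpenAI SDK, so the only configurable part is the base_url that --model_server ends up supplying:

from openai import OpenAI

# The value of --model_server is what ultimately reaches the SDK as base_url
# (see qwen_agent/llm/oai.py linked above).
client = OpenAI(base_url='http://127.0.0.1:11434/v1', api_key='none')

# client.chat.completions.create(...) always POSTs to {base_url}/chat/completions,
# i.e. http://127.0.0.1:11434/v1/chat/completions here; the "/chat/completions"
# suffix itself cannot be changed through configuration, only base_url can.
print(client.base_url)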

JianxinMa (Mar 07 '24)

I suggest first testing the example given in this document: https://github.com/ollama/ollama/blob/main/docs/openai.md:

from openai import OpenAI
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    # required but ignored
    api_key='ollama',
)
chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama2',  # change this to qwen:4b
)

Make sure this example runs successfully.

JianxinMa (Mar 07 '24)

Does this mean that Qwen must be served in an OpenAI-API-compatible mode for this to work?

LaoK263 (Mar 11 '24)

Yes. When deploying a model locally, the deployment must expose an OpenAI-compatible API; see https://github.com/QwenLM/Qwen-Agent/issues/95#issuecomment-1987785011. If you do not want to deploy your own model, you can use the cloud service provided by DashScope.
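For reference, a hedged sketch (based on the project's README, with placeholder model names and keys, not from the original exchange) of what the two options can look like as a Qwen-Agent LLM configuration:

from qwen_agent.llm import get_chat_model

# Option 1: a locally served model behind an OpenAI-compatible API,
# e.g. Ollama's /v1 endpoint as discussed above.
local_llm = get_chat_model({
    'model': 'qwen:0.5b',
    'model_server': 'http://127.0.0.1:11434/v1',
    'api_key': 'none',  # required by the OpenAI client but ignored by Ollama
})

# Option 2: DashScope's cloud service, with no local deployment.
cloud_llm = get_chat_model({
    'model': 'qwen-max',
    'model_server': 'dashscope',
    'api_key': 'YOUR_DASHSCOPE_API_KEY',  # placeholder
})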

JianxinMa (Mar 11 '24)