[feature] llm: add OpenAI-style server support (custom proxy base URL and model name)
https://github.com/InternLM/lagent/pull/294
Purpose
Reuse existing, already-deployed LLM services that expose an OpenAI-style API.
Usage
Note: if lagent has not yet merged this PR, you need to manually copy the files from lagent PR #294 over the corresponding files inside the installed lagent package directory (or build and install lagent from source yourself).
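To locate the installed package directory before overwriting files, the standard library is enough. A minimal sketch (`package_dir` is a hypothetical helper, not part of lagent; it is demonstrated on a stdlib package so it runs anywhere, but for this issue you would call it with "lagent"):

```python
import importlib.util
from pathlib import Path

def package_dir(name):
    # Resolve the on-disk directory of an installed package; copy the
    # PR #294 files over the matching files inside that directory.
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(name)
    return Path(spec.origin).parent

# Demonstrated on a stdlib package so the sketch runs anywhere:
print(package_dir("json"))
```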
mindsearch
- mindsearch\agent\models.py
  - set api_base, model_name, and key (if the server requires one)
- Command line: python -m mindsearch.app --lang ch --model_format gptstyle --search_engine SearxngSearch --asy
import os

from lagent.llms import (
    GPTStyleAPI,
    GPTAPI,
    INTERNLM2_META,
    HFTransformerCasualLM,
    LMDeployClient,
    LMDeployServer,
)

api_base = 'http://192.168.26.213:13000/v1/chat/completions'  # oneapi
# model_name = 'Baichuan2-Turbo'
model_name = 'deepseek-r1-14b'

gptstyle = dict(
    type=GPTStyleAPI,
    model_type=model_name,
    # key=os.environ.get('OPENAI_STYLE_API_KEY', 'sk-IXgCTwuoEwxL1CiBE4744688D8094521B70f4aDeE6830c5e'),
    # api_base=os.environ.get('OPENAI_STYLE_API_BASE', api_base),
    key='sk-CZOUavQGNzkkQjZr626908A0011040F8B743C526F315D6Ee',
    api_base=api_base,
)
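For reference, the request body that an endpoint like the api_base above is expected to accept follows the OpenAI chat-completions convention. A minimal sketch (`build_chat_payload` is a hypothetical helper for illustration, not part of lagent or this PR):

```python
import json

# Hypothetical helper: build the OpenAI-style /v1/chat/completions
# request body that a proxy such as oneapi routes by the "model" field.
def build_chat_payload(model_name, prompt, stream=True):
    return {
        "model": model_name,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

payload = build_chat_payload("deepseek-r1-14b", "hello")
print(json.dumps(payload, ensure_ascii=False))
```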
PS: INTERNLM appears to have model-specific adaptations. deepseek-r1-14b has its own reasoning behavior and does not seem well suited to MindSearch.
Tested via oneapi; the output is as follows:
C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\agents\stream.py:212: UserWarning: Neither plugin nor interpreter executor is initialized. An exception will be thrown when the agent call a tool.
  warnings.warn(
INFO:     127.0.0.1:65137 - "POST /solve HTTP/1.1" 200 OK
response {"id":"chatcmpl-3124175edf96481b8fe2a952fd5f8748","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'
Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 702, in streaming
    if choice['finish_reason'] == 'stop':
KeyError: 'finish_reason'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 745, in _stream_chat
    async for msg in streaming(raw_response):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 708, in streaming
    raise Exception(msg) from exc
Exception: response {"id":"chatcmpl-3124175edf96481b8fe2a952fd5f8748","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'

response {"id":"chatcmpl-016f9b82f10b43369ea9fbc3796e2717","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'
Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 702, in streaming
    if choice['finish_reason'] == 'stop':
KeyError: 'finish_reason'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 745, in _stream_chat
    async for msg in streaming(raw_response):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 708, in streaming
    raise Exception(msg) from exc
Exception: response {"id":"chatcmpl-016f9b82f10b43369ea9fbc3796e2717","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'

ERROR:root:An error occurred while generating the response.
Traceback (most recent call last):
  File "D:\MindSearch-main\mindsearch\app.py", line 142, in generate
    async for message in agent(inputs, session_id=session_id):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 50, in __call__
    async for response_message in self.forward(*message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\mindsearch_agent.py", line 156, in forward
    async for message in self.agent(message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 50, in __call__
    async for response_message in self.forward(*message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 99, in forward
    async for model_state, response, _ in self.llm.stream_chat(
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 573, in stream_chat
    async for text in self._stream_chat(messages, **gen_params):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 773, in _stream_chat
    raise RuntimeError(
RuntimeError: Calling OpenAI failed after retrying for 2 times. Check the logs for details. errmsg: response {"id":"chatcmpl-016f9b82f10b43369ea9fbc3796e2717","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'
Does anyone know why this happens?
Which channel are you using? Most likely it is a pseudo-OpenAI-style endpoint, hence KeyError: 'finish_reason'. Or openai_style.py's adaptation is not complete enough. Are you using oneapi?
As shown in the screenshot, I am using qwen2:7b-instruct-fp16 via oneapi, and openai_style.py is copied directly from the PR #294 files.
Also, in the test_gptstyleapi file, results are returned normally.
I suspect the streaming chunks are missing the finish_reason field, so I changed the code in openai_style.py to:

finish_reason = choice.get('finish_reason')
if finish_reason and finish_reason == 'stop':
    return
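That tolerant lookup can be exercised offline. A minimal sketch simulating streamed chunks with and without finish_reason (the sample data below imitates the chunks in the logs; it is not the lagent implementation):

```python
import json

# Sample streamed chunk bodies: some backends omit "finish_reason"
# entirely in delta chunks, which is what triggered the KeyError above.
chunks = [
    '{"choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]}',
    '{"choices":[{"index":0,"delta":{"content":" be"}}]}',
    '{"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
]

def stream_text(lines):
    out = []
    for line in lines:
        choice = json.loads(line)["choices"][0]
        delta = choice.get("delta", {})
        if "content" in delta:
            out.append(delta["content"])
        # .get() returns None for absent keys instead of raising KeyError
        if choice.get("finish_reason") == "stop":
            break
    return "".join(out)

print(stream_text(chunks))  # -> "To be"
```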
With this change the model does produce output,
but an error is raised when the graph and nodes are created:
ERROR:root:An error occurred while generating the response.
Traceback (most recent call last):
  File "D:\MindSearch-main\mindsearch\app.py", line 142, in generate
    async for message in agent(inputs, session_id=session_id):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 50, in __call__
    async for response_message in self.forward(*message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\mindsearch_agent.py", line 185, in forward
    for graph_exec in gen:
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\utils\util.py", line 138, in __iter__
    self.ret = yield from self.generator
  File "D:\MindSearch-main\mindsearch\agent\graph.py", line 268, in run
    while graph.n_active_tasks:
AttributeError: 'WebSearchGraph' object has no attribute 'n_active_tasks'
Could this be a model problem, or something else? When I use the local internlm/internlm2_5-7b-chat it runs normally.
How is qwen2:7b-instruct-fp16 connected to oneapi? If it is qwen2's own server being requested directly, its response format likely differs slightly from the OpenAI style. I self-hosted Baichuan and found exactly that: the response structure differs slightly.
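One way to pin down such a "slight difference" is to compare each streamed choice against the fields the client expects. A sketch (`missing_choice_keys` is a hypothetical checker, not part of lagent; the expected key set reflects the fields openai_style.py reads per the logs above):

```python
import json

# Fields the streaming client reads from each entry of "choices".
EXPECTED_CHOICE_KEYS = {"index", "delta", "finish_reason"}

def missing_choice_keys(raw_chunk):
    # Report which expected fields a backend's chunk omits.
    choice = json.loads(raw_chunk)["choices"][0]
    return sorted(EXPECTED_CHOICE_KEYS - choice.keys())

# The chunk shape from the logs above: no finish_reason at all.
sample = ('{"id":"chatcmpl-x","object":"chat.completion.chunk",'
          '"choices":[{"index":0,"delta":{"content":"To"}}]}')
print(missing_choice_keys(sample))  # -> ['finish_reason']
```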