[feature] llm: add OpenAI-style server support (custom proxy base URL and model name)
https://github.com/InternLM/lagent/pull/294
Purpose
Reuse existing, already-deployed LLM services that expose an OpenAI-style API.
Usage
Note: if lagent has not yet merged this PR, you need to manually copy the files from lagent PR #294 over the corresponding files inside the installed lagent package directory (or build and install lagent from source yourself).
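To locate the installed package directory before overwriting files, the standard library is enough. A minimal sketch (`package_dir` is a hypothetical helper, not part of lagent; it is demonstrated on a stdlib package so it runs anywhere, but for this issue you would call it with "lagent"):

```python
import importlib.util
from pathlib import Path

def package_dir(name):
    # Resolve the on-disk directory of an installed package; copy the
    # PR #294 files over the matching files inside that directory.
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(name)
    return Path(spec.origin).parent

# Demonstrated on a stdlib package so the sketch runs anywhere:
print(package_dir("json"))
```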
mindsearch
- mindsearch\agent\models.py
  - set api_base, model_name, and key (if the server requires one)
- Command line: python -m mindsearch.app --lang ch --model_format gptstyle --search_engine SearxngSearch --asy
import os

from lagent.llms import (
    GPTStyleAPI,
    GPTAPI,
    INTERNLM2_META,
    HFTransformerCasualLM,
    LMDeployClient,
    LMDeployServer,
)

api_base = 'http://192.168.26.213:13000/v1/chat/completions'  # oneapi
# model_name = 'Baichuan2-Turbo'
model_name = 'deepseek-r1-14b'

gptstyle = dict(
    type=GPTStyleAPI,
    model_type=model_name,
    # key=os.environ.get('OPENAI_STYLE_API_KEY', 'sk-IXgCTwuoEwxL1CiBE4744688D8094521B70f4aDeE6830c5e'),
    # api_base=os.environ.get('OPENAI_STYLE_API_BASE', api_base),
    key='sk-CZOUavQGNzkkQjZr626908A0011040F8B743C526F315D6Ee',
    api_base=api_base,
)
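For reference, the request body that an endpoint like the api_base above is expected to accept follows the OpenAI chat-completions convention. A minimal sketch (`build_chat_payload` is a hypothetical helper for illustration, not part of lagent or this PR):

```python
import json

# Hypothetical helper: build the OpenAI-style /v1/chat/completions
# request body that a proxy such as oneapi routes by the "model" field.
def build_chat_payload(model_name, prompt, stream=True):
    return {
        "model": model_name,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

payload = build_chat_payload("deepseek-r1-14b", "hello")
print(json.dumps(payload, ensure_ascii=False))
```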
PS: INTERNLM appears to have model-specific adaptations. deepseek-r1-14b has its own reasoning behavior and does not seem well suited to MindSearch.
Tested via oneapi; the output is as follows:
C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\agents\stream.py:212: UserWarning: Neither plugin nor interpreter executor is initialized. An exception will be thrown when the agent call a tool.
  warnings.warn(
INFO:     127.0.0.1:65137 - "POST /solve HTTP/1.1" 200 OK
response {"id":"chatcmpl-3124175edf96481b8fe2a952fd5f8748","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'
Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 702, in streaming
    if choice['finish_reason'] == 'stop':
KeyError: 'finish_reason'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 745, in _stream_chat
    async for msg in streaming(raw_response):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 708, in streaming
    raise Exception(msg) from exc
Exception: response {"id":"chatcmpl-3124175edf96481b8fe2a952fd5f8748","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'

response {"id":"chatcmpl-016f9b82f10b43369ea9fbc3796e2717","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'
Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 702, in streaming
    if choice['finish_reason'] == 'stop':
KeyError: 'finish_reason'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 745, in _stream_chat
    async for msg in streaming(raw_response):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 708, in streaming
    raise Exception(msg) from exc
Exception: response {"id":"chatcmpl-016f9b82f10b43369ea9fbc3796e2717","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'

ERROR:root:An error occurred while generating the response.
Traceback (most recent call last):
  File "D:\MindSearch-main\mindsearch\app.py", line 142, in generate
    async for message in agent(inputs, session_id=session_id):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 50, in __call__
    async for response_message in self.forward(*message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\mindsearch_agent.py", line 156, in forward
    async for message in self.agent(message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 50, in __call__
    async for response_message in self.forward(*message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 99, in forward
    async for model_state, response, _ in self.llm.stream_chat(
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 573, in stream_chat
    async for text in self._stream_chat(messages, **gen_params):
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\llms\openai_style.py", line 773, in _stream_chat
    raise RuntimeError(
RuntimeError: Calling OpenAI failed after retrying for 2 times. Check the logs for details. errmsg: response {"id":"chatcmpl-016f9b82f10b43369ea9fbc3796e2717","object":"chat.completion.chunk","created":1739267631,"model":"qwen2:7b-instruct-fp16","choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]} lead to exception of 'finish_reason'
Does anyone know why this happens?
Which channel are you using? Most likely it is a pseudo-OpenAI-style endpoint, hence KeyError: 'finish_reason'. Or openai_style.py's adaptation is not complete enough. Are you using oneapi?
As shown in the screenshot, I am using qwen2:7b-instruct-fp16 via oneapi, and openai_style.py is copied directly from the PR #294 files.
Also, in the test_gptstyleapi file, results are returned normally.
I suspect the streaming chunks are missing the finish_reason field, so I changed the code in openai_style.py to:

finish_reason = choice.get('finish_reason')
if finish_reason and finish_reason == 'stop':
    return
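That tolerant lookup can be exercised offline. A minimal sketch simulating streamed chunks with and without finish_reason (the sample data below imitates the chunks in the logs; it is not the lagent implementation):

```python
import json

# Sample streamed chunk bodies: some backends omit "finish_reason"
# entirely in delta chunks, which is what triggered the KeyError above.
chunks = [
    '{"choices":[{"index":0,"delta":{"role":"assistant","content":"To"}}]}',
    '{"choices":[{"index":0,"delta":{"content":" be"}}]}',
    '{"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
]

def stream_text(lines):
    out = []
    for line in lines:
        choice = json.loads(line)["choices"][0]
        delta = choice.get("delta", {})
        if "content" in delta:
            out.append(delta["content"])
        # .get() returns None for absent keys instead of raising KeyError
        if choice.get("finish_reason") == "stop":
            break
    return "".join(out)

print(stream_text(chunks))  # -> "To be"
```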
With this change the model does produce output,
but an error is raised when the graph and nodes are created:
ERROR:root:An error occurred while generating the response.
Traceback (most recent call last):
  File "D:\MindSearch-main\mindsearch\app.py", line 142, in generate
    async for message in agent(inputs, session_id=session_id):
  File "D:\MindSearch-main\mindsearch\agent\streaming.py", line 50, in __call__
    async for response_message in self.forward(*message, session_id=session_id, **kwargs):
  File "D:\MindSearch-main\mindsearch\agent\mindsearch_agent.py", line 185, in forward
    for graph_exec in gen:
  File "C:\Users\11238\.conda\envs\aisearch\lib\site-packages\lagent\utils\util.py", line 138, in __iter__
    self.ret = yield from self.generator
  File "D:\MindSearch-main\mindsearch\agent\graph.py", line 268, in run
    while graph.n_active_tasks:
AttributeError: 'WebSearchGraph' object has no attribute 'n_active_tasks'
Could this be a model problem, or something else? When I use the local internlm/internlm2_5-7b-chat it runs normally.
How is qwen2:7b-instruct-fp16 connected to oneapi? If it is qwen2's own server being requested directly, its response format likely differs slightly from the OpenAI style. I self-hosted Baichuan and found exactly that: the response structure differs slightly.
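One way to pin down such a "slight difference" is to compare each streamed choice against the fields the client expects. A sketch (`missing_choice_keys` is a hypothetical checker, not part of lagent; the expected key set reflects the fields openai_style.py reads per the logs above):

```python
import json

# Fields the streaming client reads from each entry of "choices".
EXPECTED_CHOICE_KEYS = {"index", "delta", "finish_reason"}

def missing_choice_keys(raw_chunk):
    # Report which expected fields a backend's chunk omits.
    choice = json.loads(raw_chunk)["choices"][0]
    return sorted(EXPECTED_CHOICE_KEYS - choice.keys())

# The chunk shape from the logs above: no finish_reason at all.
sample = ('{"id":"chatcmpl-x","object":"chat.completion.chunk",'
          '"choices":[{"index":0,"delta":{"content":"To"}}]}')
print(missing_choice_keys(sample))  # -> ['finish_reason']
```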