
Why is the system message passed twice?

Open ciaoyizhen opened this issue 1 year ago • 5 comments

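I was reading openai_api_server.py in basic_demo. It has a function, process_messages, whose checks all test whether tool_choice is "none", but when I call it through the openai_request file, the value passed is always None.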
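
To make the None versus "none" point concrete, here is a minimal sketch of how the two tool-related conditions evaluate. The helper name `guards` is hypothetical; only the two conditions are taken from the process_messages function quoted later in this thread.

```python
def guards(tools, tool_choice):
    """Evaluate the two tool-related conditions used by process_messages."""
    # Condition guarding the branch that may add a tools-bearing system message.
    enters_tool_branch = tool_choice != "none"
    # Condition guarding the final insert of the system message at index 0.
    runs_final_insert = (not tools) or tool_choice == "none"
    return enters_tool_branch, runs_final_insert


# The string "none" and the object None are different values in Python.
print(guards(tools=None, tool_choice="none"))  # (False, True)
print(guards(tools=None, tool_choice=None))    # (True, True)
```

With tools=None the tool branch adds nothing in either case, because its inner `if tools:` check is false, while the final system insert runs in both cases.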

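Before the final step, the message list already contains the system and user messages. Then there is one more check that inserts the system message at the very front, so two system messages are actually processed. That doesn't seem right to me. Shouldn't a single system message be enough?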

ciaoyizhen avatar Aug 05 '24 09:08 ciaoyizhen

Please take a closer look: with or without tools, there will only ever be one system message in the end.

zhipuch avatar Aug 29 '24 03:08 zhipuch

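@zhipuch https://github.com/THUDM/GLM-4/blob/main/basic_demo/openai_api_server.py#L320 Debug it yourself and you will see. I can't post the run from my company machine, and the server I rent externally doesn't allow outgoing requests.

Specifically, in the process_message function, the system message is already added during the earlier processing, in the final else branch.

Then, once that has finished, a system message is inserted at the very front of the message list.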
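
The two steps described above can be sketched as follows. This is a stripped-down illustration using plain dicts, not the repository code; `two_step_sketch` is a hypothetical name, and all tool handling is omitted.

```python
def two_step_sketch(messages):
    """Mimic the no-tools path: copy every message, then insert system at the front."""
    processed = []

    # Step 1: the final else branch of the loop copies system and user messages.
    for m in messages:
        processed.append({"role": m["role"], "content": m["content"]})

    # Step 2: after the loop, the first system message is inserted at index 0.
    for m in messages:
        if m["role"] == "system":
            processed.insert(0, {"role": m["role"], "content": m["content"]})
            break
    return processed


result = two_step_sketch([
    {"role": "system", "content": "sys"},
    {"role": "user", "content": "hi"},
])
print([m["role"] for m in result])  # ['system', 'system', 'user']
```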

ciaoyizhen avatar Sep 01 '24 04:09 ciaoyizhen

Please post the request body that triggers the problem so I can reproduce it.

zhipuch avatar Sep 02 '24 08:09 zhipuch

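@zhipuch Sorry, pigcha went down and I couldn't find a usable VPN, so I have been stuck on that. I also kept trying to debug on the rented server and take a screenshot, but couldn't get it to work (it may simply not be possible on 恒源云). Today it occurred to me that I could just extract the message-processing code on its own. Below is my code, pulled from the repository without modification; I only lifted out the calling logic.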

from typing import List, Literal, Optional, Union
from pydantic import BaseModel, Field


class FunctionCall(BaseModel):
    name: Optional[str] = None
    arguments: Optional[str] = None
    
class ChatCompletionMessageToolCall(BaseModel):
    index: Optional[int] = 0
    id: Optional[str] = None
    function: FunctionCall
    type: Optional[Literal["function"]] = 'function'

class ChoiceDeltaToolCallFunction(BaseModel):
    name: Optional[str] = None
    arguments: Optional[str] = None

class ChatMessage(BaseModel):
    # Note on the "function" field:
    # With older OpenAI API versions, add a function field here and add the matching role-conversion logic (to observation) in the process_messages function

    role: Literal["user", "assistant", "system", "tool"]
    content: Optional[str] = None
    function_call: Optional[ChoiceDeltaToolCallFunction] = None
    tool_calls: Optional[List[ChatCompletionMessageToolCall]] = None
    

def process_messages(messages, tools=None, tool_choice="none"):
    _messages = messages
    processed_messages = []
    msg_has_sys = False

    def filter_tools(tool_choice, tools):
        function_name = tool_choice.get('function', {}).get('name', None)
        if not function_name:
            return []
        filtered_tools = [
            tool for tool in tools
            if tool.get('function', {}).get('name') == function_name
        ]
        return filtered_tools

    if tool_choice != "none":
        if isinstance(tool_choice, dict):
            tools = filter_tools(tool_choice, tools)
        if tools:
            processed_messages.append(
                {
                    "role": "system",
                    "content": None,
                    "tools": tools
                }
            )
            msg_has_sys = True

    if isinstance(tool_choice, dict) and tools:
        processed_messages.append(
            {
                "role": "assistant",
                "metadata": tool_choice["function"]["name"],
                "content": ""
            }
        )

    for m in _messages:
        role, content, func_call = m.role, m.content, m.function_call
        tool_calls = getattr(m, 'tool_calls', None)

        if role == "function":
            processed_messages.append(
                {
                    "role": "observation",
                    "content": content
                }
            )
        elif role == "tool":
            processed_messages.append(
                {
                    "role": "observation",
                    "content": content,
                    "function_call": True
                }
            )
        elif role == "assistant":
            if tool_calls:
                for tool_call in tool_calls:
                    processed_messages.append(
                        {
                            "role": "assistant",
                            "metadata": tool_call.function.name,
                            "content": tool_call.function.arguments
                        }
                    )
            else:
                for response in content.split("\n"):
                    if "\n" in response:
                        metadata, sub_content = response.split("\n", maxsplit=1)
                    else:
                        metadata, sub_content = "", response
                    processed_messages.append(
                        {
                            "role": role,
                            "metadata": metadata,
                            "content": sub_content.strip()
                        }
                    )
        else:
            if role == "system" and msg_has_sys:
                msg_has_sys = False
                continue
            processed_messages.append({"role": role, "content": content})

    if not tools or tool_choice == "none":
        for m in _messages:
            if m.role == 'system':
                processed_messages.insert(0, {"role": m.role, "content": m.content})
                break
    return processed_messages


if __name__ == "__main__":
    messages = [
        {
            "role": "system",
            "content": "请在你输出的时候都带上“喵喵喵”三个字,放在开头。",
        },
        {
            "role": "user",
            "content": "你是谁"
        }
    ]

    # The request carries messages as a List, so they are converted to ChatMessage objects here
    message = []
    for m in messages:
        m = ChatMessage(**m)
        message.append(m)
    message = process_messages(message)
    print(message)

[image] As you can see, there are now two system messages. The example was pulled straight from the repository; I only changed the calling logic.
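
One way to check for the duplication without a screenshot, and one possible guard against it, is sketched below. Both helpers (`count_system` and `insert_system_once`) are hypothetical and are not part of the repository; the guard is only one of several ways the final insert could be made idempotent.

```python
def count_system(processed):
    """Return how many entries in the processed list have role 'system'."""
    return sum(1 for m in processed if m.get("role") == "system")


def insert_system_once(processed, messages):
    """Insert the first system message at the front only if none is present yet."""
    if count_system(processed) > 0:
        return processed
    for m in messages:
        if m["role"] == "system":
            return [{"role": "system", "content": m["content"]}] + processed
    return processed


original = [{"role": "system", "content": "sys"}, {"role": "user", "content": "hi"}]
already_copied = [dict(m) for m in original]  # stands in for what the loop produced
print(count_system(insert_system_once(already_copied, original)))  # 1
```

Running `count_system` on the list printed by the script above would report 2 for the unmodified logic.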

Sorry for the late reply.

ciaoyizhen avatar Sep 15 '24 03:09 ciaoyizhen

It is the simple_chat example from the repository.

ciaoyizhen avatar Sep 17 '24 12:09 ciaoyizhen