FastChat
Is the OpenAI-compatible API sending the `functions` argument to the model?
I'm currently using the open-source model functionary, which supports function calling in the same style as GPT through the OpenAI API. I've deployed the model worker successfully and made chat completion requests using the functions feature:
```
curl {API}/chat/completions -H "Content-Type: application/json" -d '{
  "model": "functionary-7b",
  "messages": [{"role": "user", "content": "salute Anthony"}],
  "functions": [
    {
      "name": "salute",
      "description": "This function can be used to salute someone.",
      "parameters": {
        "title": "salute",
        "type": "object",
        "properties": {
          "who": {
            "description": "Name of whom to salute. o7",
            "title": "Who",
            "type": "string"
          }
        },
        "required": ["who"]
      }
    }
  ]
}'
```
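For reference, the same request through the pre-1.0 `openai` Python SDK; the base URL and key below are placeholders for whatever your FastChat server uses:

```python
import openai

openai.api_base = "http://localhost:8000/v1"  # FastChat's OpenAI-compatible endpoint (placeholder)
openai.api_key = "EMPTY"  # FastChat ignores the key unless --api-keys is set

response = openai.ChatCompletion.create(
    model="functionary-7b",
    messages=[{"role": "user", "content": "salute Anthony"}],
    functions=[
        {
            "name": "salute",
            "description": "This function can be used to salute someone.",
            "parameters": {
                "title": "salute",
                "type": "object",
                "properties": {
                    "who": {
                        "description": "Name of whom to salute. o7",
                        "title": "Who",
                        "type": "string",
                    }
                },
                "required": ["who"],
            },
        }
    ],
)
print(response["choices"][0]["message"])
```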
The expected outcome was to receive a message that included the function_call argument, indicating the use of the salute function with the argument Anthony. However, the actual result was as follows:
{"id":"chatcmpl-fUcuEKMeCkavtjsPCzeKft","object":"chat.completion","created":1694683275,"model":"functionary-7b","choices":[{"index":0,"message":{"role":"assistant","content":"\nSalute!\n"},"finish_reason":"stop"}],"usage":{"prompt_tokens":579,"total_tokens":585,"completion_tokens":6}}
It appears that the `functions` argument is being ignored altogether. My question is whether `fastchat/serve/openai_api_server.py` is indeed sending the `functions` argument to the model.
`openai_api_server` only accepts these parameters:

```python
@app.post("/v1/chat/completions", dependencies=[Depends(check_api_key)])
async def create_chat_completion(request: ChatCompletionRequest):
    ...

class ChatCompletionRequest(BaseModel):
    model: str
    messages: Union[str, List[Dict[str, str]]]
    temperature: Optional[float] = 0.7
    top_p: Optional[float] = 1.0
    n: Optional[int] = 1
    max_tokens: Optional[int] = None
    stop: Optional[Union[str, List[str]]] = None
    stream: Optional[bool] = False
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    user: Optional[str] = None
```
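Because pydantic models ignore unknown fields by default, the `functions` key in the request body is silently dropped at parse time and never reaches the worker. A minimal sketch of that behavior (pydantic v1-style semantics, which FastChat used at the time):

```python
from typing import Optional
from pydantic import BaseModel

class Req(BaseModel):  # stand-in for ChatCompletionRequest, without a `functions` field
    model: str
    temperature: Optional[float] = 0.7

# The unknown `functions` kwarg is ignored, not rejected.
req = Req(model="functionary-7b", functions=[{"name": "salute"}])
print(req.dict())  # {'model': 'functionary-7b', 'temperature': 0.7} -- no 'functions'
```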
Modify `protocol/openai_api_protocol.py`:

```python
class ChatCompletionRequest(BaseModel):
    model: str
    messages: Union[str, List[Dict[str, str]]]
    temperature: Optional[float] = 0.7
    top_p: Optional[float] = 1.0
    n: Optional[int] = 1
    max_tokens: Optional[int] = None
    stop: Optional[Union[str, List[str]]] = None
    stream: Optional[bool] = False
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    user: Optional[str] = None
    functions: Optional[List[Any]] = None  # new field; needs `Any` from typing
```
Modify `serve/openai_api_server.py`:

```python
@app.post("/v1/chat/completions", dependencies=[Depends(check_api_key)])
async def create_chat_completion(request: ChatCompletionRequest):
    """Creates a completion for the chat message"""
    print("\n fastchat request:", request)
    error_check_ret = await check_model(request)
    if error_check_ret is not None:
        return error_check_ret
    error_check_ret = check_requests(request)
    if error_check_ret is not None:
        return error_check_ret

    async with httpx.AsyncClient() as client:
        worker_addr = await get_worker_address(request.model, client)

        gen_params = await get_gen_params(
            request.model,
            worker_addr,
            request.messages,
            temperature=request.temperature,
            top_p=request.top_p,
            max_tokens=request.max_tokens,
            echo=False,
            stream=request.stream,
            stop=request.stop,
        )
        gen_params["functions"] = request.functions  # new: forward functions to the worker
        error_check_ret = await check_length(
            request,
            gen_params["prompt"],
            gen_params["max_new_tokens"],
            worker_addr,
            client,
        )
        # ... rest of the handler unchanged ...
```
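Note that forwarding `functions` in `gen_params` only gets the schemas as far as the worker; something on the worker side still has to render them into the prompt text the model actually sees. A hypothetical sketch of the idea (not FastChat's actual code, and functionary expects its own specific schema format, not this generic JSON dump):

```python
import json
from typing import Any, Dict, List, Optional

def render_functions_into_prompt(prompt: str,
                                 functions: Optional[List[Dict[str, Any]]]) -> str:
    """Prepend function schemas to the prompt text sent to the model."""
    if not functions:
        return prompt
    schema_block = "\n".join(json.dumps(f, indent=2) for f in functions)
    return f"You have access to the following functions:\n{schema_block}\n\n{prompt}"
```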
@jiaolongxue Since you seem to be familiar with it, could you make a pull request to add this functionality to the library?
As long as FastChat does not support the `functions` parameter yet, you can still use function calling with open-source models (such as Qwen-1.8b, Qwen-14b, ChatGLM3) with the following steps:
- Render the functions into the messages directly, then send them to the FastChat OpenAI API server (see the sketch after this list).
- Define a custom agent using LangChain with a customized prompt and output parser.
A reference can be found here: qwen-agent
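A minimal sketch of both steps. The "CALL: ... ARGS: ..." calling convention here is an invented placeholder; match the prompt and output format to whatever your model (Qwen, ChatGLM3, ...) was actually trained on:

```python
import json
import re

# Function schema taken from the example at the top of this issue.
functions = [{
    "name": "salute",
    "description": "This function can be used to salute someone.",
    "parameters": {
        "type": "object",
        "properties": {"who": {"type": "string", "description": "Name of whom to salute. o7"}},
        "required": ["who"],
    },
}]

# Step 1: render the schemas into the messages yourself.
system_prompt = (
    "You can call the following functions. To call one, reply exactly with\n"
    "CALL: <name> ARGS: <json-arguments>\n\n"
    + "\n".join(json.dumps(f) for f in functions)
)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "salute Anthony"},
]
# ...send `messages` to the FastChat OpenAI API server as a plain chat completion...

# Step 2: a custom output parser pulls the call back out of the model's reply.
def parse_function_call(reply: str):
    m = re.search(r"CALL:\s*(\w+)\s*ARGS:\s*(\{.*\})", reply, re.DOTALL)
    return (m.group(1), json.loads(m.group(2))) if m else None

print(parse_function_call('CALL: salute ARGS: {"who": "Anthony"}'))
# -> ('salute', {'who': 'Anthony'})
```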
I tried the modifications above to `protocol/openai_api_protocol.py` and `serve/openai_api_server.py`, but this still doesn't work. Can you help me find out what's wrong? Thanks!
Is there any appetite for this to be integrated through a PR? Open to taking a look if so.
@ckgresla Can you? I know we're a bit late on this, but it shouldn't be that hard. Also have a look at https://github.com/lm-sys/FastChat/pull/2085