FastChat
Fixed bug in openai_api_server.py
The conversation template comes with pre-populated default messages. These increase the token count and can lead to confusing responses from the model.
Resolved by resetting conv.messages = [].
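A minimal sketch of the kind of change described (not the exact upstream diff; the helper below is illustrative, and only get_conversation_template, conv.messages, append_message, and get_prompt are existing FastChat names):

from fastchat.model.model_adapter import get_conversation_template

def build_prompt(model_name: str, request_messages: list) -> str:
    # Illustrative helper, not FastChat's actual request handler.
    conv = get_conversation_template(model_name)
    conv.messages = []  # drop the template's pre-populated default messages
    for message in request_messages:
        role = conv.roles[0] if message["role"] == "user" else conv.roles[1]
        conv.append_message(role, message["content"])
    conv.append_message(conv.roles[1], None)  # leave the assistant turn open
    return conv.get_prompt()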
Having tested this change in production, I can confirm it does not appear to break anything for us, and according to one user report it may well have improved the responses significantly.
https://github.com/lm-sys/FastChat/issues/1641
new bug
Can this PR (https://github.com/lm-sys/FastChat/pull/1528) by @andy-yang-1 fix this problem?
This is an automated vacation reply from QQ Mail. Hello, I am currently on vacation and unable to reply to your email personally. I will get back to you as soon as possible after the vacation ends.
Hey @merrymercy, it doesn't resolve the issue. The default messages in the template are still added.
Logging the prompt after line 269 in openai_api_server.py shows the default messages that are used:
### Human: Got any creative ideas for a 10 year old’s birthday? ### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party: ....
Which model did you use? Some models need this one-shot example, while others do not. The recommended way is to register a special conversation template for your model.
I am using fastchat-t5. It works better without the template. My use case is a chat conversation between a user and a bot with context from documents, so having the template hinders the model.
I see. We may consider dropping the one-shot example for fastchat-t5 or providing a new API so that you can override the default conversation template. Your current patch will break other models, so we won't accept it.
@DachengLi1 could you check whether T5 works after we drop the one-shot example?
Taking a look, will get back to you in a few hours.
@merrymercy I tried several prompts from the 80 questions. It looks good to me. Attached are two examples.
There could be future models that do not require one-shot examples. Would it be better to let the user decide whether to use the default one-shot example or provide their own? Ideally at inference time, but more practically at model-loading time.
@Perseus14 Yes. Now you can implement and register your own model adapter to choose the conversation template.
You can override get_default_conv_template in https://github.com/lm-sys/FastChat/blob/a10cb06846138337c09e674d47faaaec9fdd1b3c/fastchat/model/model_adapter.py#L455
with:
def get_default_conv_template(self, model_path: str) -> Conversation:
    return get_conv_template("zero_shot")
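For illustration, a fuller sketch of a custom adapter using this override. The class name and match rule are hypothetical; BaseModelAdapter and get_conv_template are existing FastChat helpers, but treat the structure as an assumption rather than the upstream implementation:

from fastchat.conversation import Conversation, get_conv_template
from fastchat.model.model_adapter import BaseModelAdapter

class MyT5Adapter(BaseModelAdapter):
    # Hypothetical adapter for illustration; not an upstream class.
    def match(self, model_path: str) -> bool:
        return "fastchat-t5" in model_path.lower()

    def get_default_conv_template(self, model_path: str) -> Conversation:
        # Return the zero-shot template so no one-shot example is prepended.
        return get_conv_template("zero_shot")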
I will close this PR for now. If you are still interested in contributing to this, could you implement this priority-based model adapter registration mechanism? https://github.com/lm-sys/FastChat/blob/a10cb06846138337c09e674d47faaaec9fdd1b3c/fastchat/model/model_adapter.py#L73-L75
By doing this, you can register your own model adapter/conversation template at runtime without modifying the source code of FastChat.
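As a rough sketch of how such runtime registration could look, assuming an adapter like the MyT5Adapter example above; register_model_adapter exists in model_adapter.py today, while the priority-based lookup is the part this comment asks to be contributed:

from fastchat.model.model_adapter import register_model_adapter

# Register the custom adapter at runtime, without modifying FastChat's source.
# Today this appends to the adapter list; a priority-based mechanism would let
# a user-registered adapter take precedence over the built-in ones.
register_model_adapter(MyT5Adapter)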