Fixed bug in openai_api_server.py

Perseus14 opened this issue 2 years ago · 10 comments

The conversation template comes with pre-seeded example messages. These inflate the token count and can lead to confusing responses from the model.

Resolved it by setting conv.messages = [].
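
For reference, a minimal sketch of what the patch does, using FastChat's conversation API directly (the surrounding request handling in openai_api_server.py is omitted here):

    from fastchat.conversation import get_conv_template

    conv = get_conv_template("one_shot")  # this template ships with a seeded example dialogue
    conv.messages = []  # the fix: drop the pre-seeded example messages
    conv.append_message(conv.roles[0], "Hello!")
    conv.append_message(conv.roles[1], None)
    print(conv.get_prompt())  # the prompt now contains only our own turns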

Perseus14 avatar May 31 '23 10:05 Perseus14

Having tested this change out in production, I can confirm it does not seem to break anything for us, and it may very well have improved the responses significantly, according to one user report.

digisomni avatar May 31 '23 17:05 digisomni

https://github.com/lm-sys/FastChat/issues/1641

new bug

lplzyp avatar Jun 09 '23 06:06 lplzyp

Can this PR (https://github.com/lm-sys/FastChat/pull/1528) by @andy-yang-1 fix this problem?

merrymercy avatar Jun 09 '23 16:06 merrymercy

This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email in person. I will respond as soon as possible after my vacation ends.

lplzyp avatar Jun 09 '23 16:06 lplzyp

Hey @merrymercy, it doesn't resolve the issue. The default messages in the template are still added.

Logging the prompt after line 269 in openai_api_server.py shows the default messages that are used:

    ### Human: Got any creative ideas for a 10 year old’s birthday? ### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party: ....
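
A rough sketch of that logging, for anyone reproducing this (get_conversation_template is FastChat's real helper; that it resolves fastchat-t5 to the one-shot template above is an assumption based on the prompt shown):

    import logging

    from fastchat.model.model_adapter import get_conversation_template

    # Build the prompt the same way the server does and inspect it.
    conv = get_conversation_template("fastchat-t5")
    logging.warning("Default prompt before any user turns:\n%s", conv.get_prompt())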

Perseus14 avatar Jun 09 '23 18:06 Perseus14

Which model did you use? Some models need this one-shot example, while others do not. The recommended way is to register a special conversation template for your model.

merrymercy avatar Jun 09 '23 18:06 merrymercy

I am using fastchat-t5. It works better without the template. My use case involves a chat conversation between a user and a bot, with context from documents, so having the template hinders the model.

Perseus14 avatar Jun 09 '23 18:06 Perseus14

I see. We may consider dropping the one-shot example for fastchat-t5 or providing a new API so that you can override the default conversation template. Your current patch would break other models, so we won't accept it.

@DachengLi1 could you check whether the T5 works after we drop the one shot example?

merrymercy avatar Jun 12 '23 18:06 merrymercy

Taking a look, will get back to you in a few hours.

DachengLi1 avatar Jun 12 '23 18:06 DachengLi1

@merrymercy I tried several prompts from the 80 questions. It looks good to me. Attached are two examples. [Screenshot 2023-06-12 at 1:33:38 PM] [Screenshot 2023-06-12 at 1:34:00 PM]

DachengLi1 avatar Jun 12 '23 20:06 DachengLi1

Future models might not require the one-shot example either. Would it be better to let the user decide whether to use the default one-shot example or provide their own? Ideally during inference, but more practically during model loading.

Perseus14 avatar Jun 23 '23 06:06 Perseus14

@Perseus14 Yes. Now you can implement and register your own model adapter to choose the conversation template.

You can override get_default_conv_template in https://github.com/lm-sys/FastChat/blob/a10cb06846138337c09e674d47faaaec9fdd1b3c/fastchat/model/model_adapter.py#L455 with:

    def get_default_conv_template(self, model_path: str) -> Conversation:
        # Use the plain zero-shot template instead of the default one-shot one.
        return get_conv_template("zero_shot")

I will close this PR for now. If you are still interested in contributing to this, could you implement this priority-based model adapter registration mechanism? https://github.com/lm-sys/FastChat/blob/a10cb06846138337c09e674d47faaaec9fdd1b3c/fastchat/model/model_adapter.py#L73-L75

By doing this, you can register your own model adapter/conversation template at runtime without modifying the source code of FastChat.
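
For illustration, a rough sketch of such a runtime registration (ZeroShotT5Adapter is a hypothetical name; BaseModelAdapter, register_model_adapter, and the zero_shot template are existing FastChat pieces):

    from fastchat.conversation import Conversation, get_conv_template
    from fastchat.model.model_adapter import BaseModelAdapter, register_model_adapter

    class ZeroShotT5Adapter(BaseModelAdapter):
        """Hypothetical adapter: serve fastchat-t5 without the one-shot example."""

        def match(self, model_path: str) -> bool:
            return "fastchat-t5" in model_path.lower()

        def get_default_conv_template(self, model_path: str) -> Conversation:
            return get_conv_template("zero_shot")

    # Must run before the model is loaded. With today's first-match-wins list,
    # an adapter appended at runtime can be shadowed by earlier generic adapters,
    # which is exactly what the priority-based registration above would fix.
    register_model_adapter(ZeroShotT5Adapter)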

merrymercy avatar Jul 05 '23 09:07 merrymercy