Llama 3.1 chat template has <|begin_of_text|> encoded twice

Open horsten opened this issue 1 year ago • 0 comments

In conversation.py (the comment is mine):

        elif self.sep_style == SeparatorStyle.LLAMA3:
            # No! It's already added by encode_dialog_prompt in chat_format.py
            #ret = "<|begin_of_text|>"
            ret = ""  # start from an empty prefix so the += below still works
            if self.system_message:
                ret += system_prompt

And in encode_dialog_prompt:

        tokens = []
        tokens.append(self.tokenizer.special_tokens["<|begin_of_text|>"])
        for message in messages:

I'm not sure which of the two should be kept, but it should definitely be only one of them.
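
A minimal sketch of the duplication, using toy stand-ins rather than the real FastChat or Llama code. The names `render_template`, `toy_encode`, `count_leading_bos` and the token ids are hypothetical, chosen only for illustration; the header layout in the template is assumed from the Llama 3 prompt format. It shows that when the template emits `<|begin_of_text|>` and the encoder also prepends it, the token sequence starts with two BOS tokens:

```python
BOS = "<|begin_of_text|>"

# Hypothetical toy vocabulary; the real token ids are different.
SPECIAL_TOKENS = {BOS: 0}


def render_template(system_message: str, add_bos: bool) -> str:
    """Toy stand-in for the LLAMA3 branch of the chat template."""
    ret = BOS if add_bos else ""
    if system_message:
        ret += (
            "<|start_header_id|>system<|end_header_id|>\n\n"
            f"{system_message}<|eot_id|>"
        )
    return ret


def toy_encode(prompt: str) -> list:
    """Toy stand-in for an encoder that always prepends BOS itself.

    Only the BOS marker is recognised as a special token; every other
    character is mapped to the placeholder id 1.
    """
    tokens = [SPECIAL_TOKENS[BOS]]
    i = 0
    while i < len(prompt):
        if prompt.startswith(BOS, i):
            tokens.append(SPECIAL_TOKENS[BOS])
            i += len(BOS)
        else:
            tokens.append(1)
            i += 1
    return tokens


def count_leading_bos(tokens: list) -> int:
    """Count how many BOS ids appear at the very start of the sequence."""
    n = 0
    for token in tokens:
        if token != SPECIAL_TOKENS[BOS]:
            break
        n += 1
    return n


doubled = toy_encode(render_template("You are helpful.", add_bos=True))
single = toy_encode(render_template("You are helpful.", add_bos=False))
print(count_leading_bos(doubled), count_leading_bos(single))
```

A check like `count_leading_bos` could be run against real encoded prompts to confirm whichever side ends up keeping the token.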

horsten, Sep 25 '24 00:09