text-generation-webui
EOS token question in multi-round in OASST
I'm trying to figure out whether the prompt format for multi-round conversations appends an EOS token after each assistant reply in the history, or none at all (so that the only EOS is the one the model generates at the end of its response). Maybe it varies by model.
This is a general question (so I'm using the web UI's generic tokens in the examples below), but I'm specifically wondering about the recent Open-Assistant models, because I don't think earlier multi-round fine-tunes (e.g., Vicuna) did it the same way.
Case 1: Input to model (with history) is:
<|user|>Question 1<|endoftext|>
<|bot|>Answer 1<|endoftext|>
<|user|>Question 2<|endoftext|>
<|bot|>Answer 2<|endoftext|>
<|user|>Question 3<|endoftext|>
Response will be:
<|bot|>Answer 3<|endoftext|> EOS_TOKEN
Case 2: Input is:
<|user|>Question 1<|endoftext|>
<|bot|>Answer 1<|endoftext|> EOS_TOKEN
<|user|>Question 2<|endoftext|>
<|bot|>Answer 2<|endoftext|> EOS_TOKEN
<|user|>Question 3<|endoftext|>
Response will be (same as Case 1):
<|bot|>Answer 3<|endoftext|> EOS_TOKEN
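To make the two cases concrete, here is a minimal sketch of how each prompt would be assembled from a chat history. The function name `build_prompt` and its parameters are made up for illustration; this is not code from the web UI or from Open-Assistant, and `" EOS_TOKEN"` is just the placeholder string used in the examples above, not a real tokenizer's EOS.

```python
USER = "<|user|>"
BOT = "<|bot|>"
END = "<|endoftext|>"
EOS = " EOS_TOKEN"  # placeholder from the examples above, not a real token


def build_prompt(history, question, eos_after_replies):
    """Assemble the model input from (question, answer) pairs plus a new question.

    eos_after_replies=False reproduces Case 1, True reproduces Case 2.
    """
    lines = []
    for past_question, past_answer in history:
        lines.append(f"{USER}{past_question}{END}")
        # The only difference between the cases: an extra EOS after each past reply.
        suffix = EOS if eos_after_replies else ""
        lines.append(f"{BOT}{past_answer}{END}{suffix}")
    lines.append(f"{USER}{question}{END}")
    return "\n".join(lines)


history = [("Question 1", "Answer 1"), ("Question 2", "Answer 2")]
case_1 = build_prompt(history, "Question 3", eos_after_replies=False)
case_2 = build_prompt(history, "Question 3", eos_after_replies=True)
print(case_1)
print(case_2)
```

The two strings differ only in the assistant lines of the history, which is exactly the part I'm unsure about.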
Looking at chat.py, it looks like the web UI always assumes Case 1. However, looking at the format_pairs function (line 152 at the time of writing) in https://github.com/LAION-AI/Open-Assistant/blob/5ce45f31ebf0a1c7b389b88c9755aea8393e8f9f/model/model_training/custom_datasets/formatting.py, I'm wondering whether Open-Assistant was trained with Case 2 instead.
Does anyone know for sure?
A related question: it looks like the web UI actually feeds the following format to the model in instruct mode. It differs slightly from my case examples in that the trailing <|bot|> prompt is part of the input, not the output:
<|user|>Question 1<|endoftext|>
<|bot|>Answer 1<|endoftext|>
<|user|>Question 2<|endoftext|>
<|bot|>Answer 2<|endoftext|>
<|user|>Question 3<|endoftext|>
<|bot|>
whereas in the Open-Assistant training script, the loss mask includes the assistant token. So they are expecting something closer to my case examples in the previous post, where the model generates the assistant token itself. I don't know whether that matters either.
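Here is a toy sketch of what I mean by the mask including or excluding the assistant token. Everything in it is hypothetical: `loss_mask` is a made-up helper, and each role token and each message body is treated as a single piece rather than as real tokenizer output. It is not the Open-Assistant implementation, just an illustration of the two possible conventions.

```python
BOT = "<|bot|>"


def loss_mask(segments, include_role_token):
    """Mark which pieces would contribute to the training loss.

    segments: list of (role_token, text) pairs in conversation order.
    Returns a list of (piece, in_loss) pairs. User pieces are never in the
    loss; assistant text always is; the assistant role token is in the loss
    only when include_role_token is True.
    """
    mask = []
    for role_token, text in segments:
        is_bot = role_token == BOT
        mask.append((role_token, is_bot and include_role_token))
        mask.append((text, is_bot))
    return mask


conversation = [("<|user|>", "Question 1"), ("<|bot|>", "Answer 1")]

# Model is trained to emit <|bot|> itself (what the training script seems to do).
print(loss_mask(conversation, include_role_token=True))

# Model is trained only on the reply text; <|bot|> is expected in the input
# (what the web UI's instruct mode seems to assume).
print(loss_mask(conversation, include_role_token=False))
```

If the model learned to produce the assistant token itself, then supplying it in the input just hands the model its own first token, so it may well be harmless, but I haven't verified that.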
I wish this stuff was more standardized. We're getting there...
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.