Best format to use for Mixtral finetuning
Hi all,
I'm setting up a Mixtral SFT finetuning job, following the example in examples/scripts/sft.py. I have a question about dataset formatting, specifically when using the train-on-completions-only feature (described here: https://huggingface.co/docs/trl/sft_trainer#train-on-completions-only).
When I prepare the text field of my dataset, should I add BOS and EOS tokens in the input strings directly?
Basically, I am wondering which of the following is the correct format:
f"<s>[INST] {prompt} [/INST]{response}</s>"f"<s>[INST] {prompt} [/INST]{response}"f"[INST] {prompt} [/INST]{response}</s>"f"[INST] {prompt} [/INST]{response}"
I had assumed it was f"<s>[INST] {prompt} [/INST]{response}</s>" (as this is what the Mixtral tokenizer prints out when it applies the chat template), although I don't know if this causes issues when passed in as the text field, in case the tokenizer adds BOS and EOS tokens again under the hood.
I would appreciate any clarification or tips you can provide - thank you!
I think you do not need BOS but you need EOS.
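For what it's worth, a minimal sketch of that approach with SFTTrainer and DataCollatorForCompletionOnlyLM is below. The column names ("prompt"/"response"), the toy dataset, and the exact trainer kwargs are assumptions; adjust them to your data and TRL version:

```python
from datasets import Dataset
from transformers import AutoTokenizer
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM

model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Toy dataset; the column names here are just an assumption.
dataset = Dataset.from_dict({
    "prompt": ["What is the capital of France?"],
    "response": [" The capital of France is Paris."],
})

def formatting_prompts_func(examples):
    # No explicit "<s>": the tokenizer prepends BOS itself when tokenizing the text.
    # EOS is appended manually so the model learns where the response ends.
    return [
        f"[INST] {p} [/INST]{r}{tokenizer.eos_token}"
        for p, r in zip(examples["prompt"], examples["response"])
    ]

# Mask the loss on everything up to and including "[/INST]",
# so only the response tokens contribute to training.
collator = DataCollatorForCompletionOnlyLM(response_template="[/INST]", tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model_name,                  # an already-loaded model also works here
    tokenizer=tokenizer,
    train_dataset=dataset,
    formatting_func=formatting_prompts_func,
    data_collator=collator,
    max_seq_length=1024,
    packing=False,                     # completion-only masking requires packing=False
)
trainer.train()
```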