
How to fine-tune Starchat-beta on my question-answer dataset?

Open AIAnytime opened this issue 1 year ago • 2 comments

I have a dataset with two columns, Question and Answer. A question looks like "Write a python code to reverse a list", and the answer is the code for that question. I have looked at the Starcoder finetune.py file for fine-tuning, but that doesn't work for starchat-beta. Can anyone share a colab notebook or a code snippet to fine-tune it?

AIAnytime avatar Aug 23 '23 10:08 AIAnytime

Hi. What do you mean by "that doesn't work for starchat-beta"?

Starchat-beta itself is already an instruction-tuned model. It is a fine-tuned version of starcoderplus on the Open Assistant Guanaco dataset (see the model card). You can choose to further fine-tune it on your dataset, but for better results you'll have to comply with the fine-tuning setup that was used to obtain starchat-beta from starcoderplus. Namely, they used a different template with a system prompt. For example:

<|system|>
you are a helpful assistant.
<|end|>
<|user|>
Write a python code to reverse a list
<|end|>
<|assistant|>
code for that question
<|end|>

If you want to fine-tune starchat-beta with this code you can, but you'll have to apply some changes. You can either start from finetune/finetune.py or chat/train.py.
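
As a rough illustration of the dataset change, here is a minimal sketch that turns Question/Answer rows into the template shown above. The helper name `format_example`, the default system prompt, and the sample row are assumptions made for this example and are not taken from finetune/finetune.py or chat/train.py. The exact whitespace around the special tokens should be checked against the model card before training.

```python
# Assumed system prompt; replace it with whatever your setup uses.
SYSTEM_PROMPT = "You are a helpful assistant."


def format_example(question: str, answer: str,
                   system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap one question/answer pair in the chat template from this thread.

    Hypothetical helper: the layout mirrors the example above and is not
    copied from the repository's training scripts.
    """
    return (
        f"<|system|>\n{system_prompt}\n<|end|>\n"
        f"<|user|>\n{question.strip()}\n<|end|>\n"
        f"<|assistant|>\n{answer.strip()}\n<|end|>"
    )


# Sample row using the two column names from the question (made-up answer).
rows = [
    {
        "Question": "Write a python code to reverse a list",
        "Answer": "reversed_list = my_list[::-1]",
    },
]

# One training string per row; this is the text the tokenizer would see.
texts = [format_example(row["Question"], row["Answer"]) for row in rows]
print(texts[0])
```

Each resulting string can then be used as the single text field that the training script tokenizes, in place of separate question and answer columns.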

ArmelRandy avatar Aug 25 '23 12:08 ArmelRandy

Thanks for your reply @ArmelRandy. I have prepared the dataset as you described above, but the performance is not good after fine-tuning: it generates gibberish responses. Do you have any recommendations, for example how many steps I should fine-tune for? I have 650 QA pairs. Please suggest.

AIAnytime avatar Aug 28 '23 05:08 AIAnytime