InstructionWild

Structure of the training data

Open gabriead opened this issue 1 year ago • 0 comments

Hey guys, I have two questions regarding the structure of your dataset. In the README you say that you are using the "Alpaca" approach, which uses a triplet of instruction, input, and output. Why are you placing the concrete instruction at the end and not at the beginning? Also, I couldn't find any samples where you provide a triplet with instruction, input, and output. What is the reason for that? More concretely, you are using sample_seed.jsonl to create instructions; where do you store the input and output?
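
To make the question concrete, here is a minimal sketch of what an Alpaca-style triplet looks like and how one might check a JSONL file for complete triplets. The field names `instruction`, `input`, and `output` follow the Alpaca format; the sample records and the helpers `has_full_triplet` and `count_full_triplets` are hypothetical illustrations, not taken from this repository or from sample_seed.jsonl.

```python
import json

# Hypothetical records in an Alpaca-style layout (not real data from the repo).
SAMPLE_JSONL = """\
{"instruction": "Translate the sentence to French.", "input": "Good morning.", "output": "Bonjour."}
{"instruction": "Name three primary colors.", "input": "", "output": "Red, yellow, blue."}
{"instruction": "Write a haiku about autumn."}
"""

TRIPLET_KEYS = ("instruction", "input", "output")


def has_full_triplet(record):
    """Return True if instruction, input and output are all present and non-empty."""
    return all(str(record.get(key, "")).strip() for key in TRIPLET_KEYS)


def count_full_triplets(jsonl_text):
    """Parse JSONL text and return (records with a full triplet, total records)."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    full = sum(has_full_triplet(record) for record in records)
    return full, len(records)


full, total = count_full_triplets(SAMPLE_JSONL)
print(f"{full} of {total} records have a non-empty instruction, input and output")
```

Running the same check against the actual data files (by reading their contents instead of `SAMPLE_JSONL`) would show whether any record carries all three fields.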

gabriead, Apr 27 '23 07:04