DeepSeek-VL icon indicating copy to clipboard operation
DeepSeek-VL copied to clipboard

dataset format of pretraining stage

Open annopackage opened this issue 1 year ago • 0 comments
trafficstars

How did you unify the format of pretraining dataset? During supervised fine tuning stage, the training data are curated as question and answer pairs. For caption or detection dataset, I want to know if they follow the same format as sft data, and how to collect questions for these data as they originally only contains ground truth like caption or boxes?

annopackage avatar Jul 17 '24 05:07 annopackage