MiniCPM-V
MiniCPM-V copied to clipboard
How to organize data, which can be fine-tuned with both image-text data, as well as purely textual data.
When fine-tuning with LoRA, is it necessary to use data that includes images? If pure text data is used, would it affect the model's performance (it should not, as some open-source datasets for MLM models include SFT with pure text question-answer pairs)?
How should the JSON file be structured?
22 how can i finetune this model with Text-only data and Image-Text data in same dataset?
您可以自己调整一下数据集的写法,可以根据条件判断一下