InternVL
How to organize my customized dataset for finetuning InternVL2?
The documentation and guide for finetuning InternVL2 are generally clear. However, I ran into a problem with my dataset. As I understand from the documentation and code, a dataset suitable for Internvl_chat finetuning contains entries with a "conversations" list, like the one shown below (they are all single-turn in my case).
{
  "id": "XX",
  "image": "/path/to/image.jpg",
  "conversations": [
    {
      "from": "human",
      "value": "text1"
    },
    {
      "from": "gpt",
      "value": "text2"
    }
  ]
}
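For reference, here is a minimal Python sketch of how one such single-turn entry could be built and serialized. The helper name `make_single_turn_record` is hypothetical (it is not part of InternVL), and the field names are simply the ones from the example above.

```python
import json


def make_single_turn_record(sample_id, image_path, question, answer):
    """Build one single-turn entry with the fields shown above.

    Hypothetical helper for illustration; not an InternVL function.
    """
    return {
        "id": sample_id,
        "image": image_path,
        "conversations": [
            {"from": "human", "value": question},
            {"from": "gpt", "value": answer},
        ],
    }


record = make_single_turn_record("XX", "/path/to/image.jpg", "text1", "text2")
# Serialize the entry; how entries are grouped into files is up to the training setup.
print(json.dumps(record, indent=2))
```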
My questions are:
- Is 'system' role supported in this format?
- In my case, I have multiple images that might be used at one time (all sent by 'human'). How should I process my data? My data might look like:
{
  "id": "XX",
  "conversations": [
    {
      "from": "human",
      "value": "text <image1> some text <image2> some other text"
    },
    {
      "from": "gpt",
      "value": "text2"
    }
  ]
}
Thank you!
Will simply putting all images in "image" work for my case?
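If I go that way, a quick sanity check like the sketch below could at least confirm that the number of placeholders in the 'human' turns matches the number of image paths. This assumes "image" may hold a list of paths and that placeholders look like `<image1>`, `<image2>` as in my example; neither is confirmed as what InternVL expects, and `placeholders_match_images` is a hypothetical helper.

```python
import re

# Assumption: placeholders are written as <image1>, <image2>, ... as in the example above.
PLACEHOLDER = re.compile(r"<image\d+>")


def placeholders_match_images(record):
    """Return True if the human turns contain one placeholder per image path.

    Hypothetical check; the placeholder syntax and the list-valued "image"
    field are assumptions, not confirmed InternVL behavior.
    """
    images = record.get("image", [])
    if isinstance(images, str):
        images = [images]  # treat a single path as a one-element list
    found = []
    for turn in record["conversations"]:
        if turn["from"] == "human":
            found.extend(PLACEHOLDER.findall(turn["value"]))
    return len(found) == len(images)


sample = {
    "id": "XX",
    "image": ["/path/to/image1.jpg", "/path/to/image2.jpg"],
    "conversations": [
        {"from": "human", "value": "text <image1> some text <image2> some other text"},
        {"from": "gpt", "value": "text2"},
    ],
}
print(placeholders_match_images(sample))
```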
Hi, see this document to prepare training data: https://internvl.readthedocs.io/en/latest/get_started/chat_data_format.html
Thanks, I've already solved my issue. Sorry for not closing the issue in time.