swift InternVL-Chat 1.5 Fine-Tune About Visual Grounding Task

Open MVP-D77 opened this issue 1 month ago • 1 comment

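Regarding the fine-tuning process described in the InternVL-Chat 1.5 best practices: can it be used to fine-tune on a Visual Grounding task? If so, what should the prompt template look like?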

{
    "id": "n0167", 
    "image": "xxxxx", 
    "conversations": [
        {
            "from": "human", 
            "value": "<image>\nPlease provide the bounding box coordinate of the region this sentence describes: <ref>xxx</ref>"
        }, 
        {
            "from": "gpt", 
            "value": "<ref>xxx</ref><box>[[308, 765, 592, 1094]]</box>"
        }
    ]
}

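Is there a reference I can follow to convert a template like the one above into data that can be used for fine-tuning? Also, if I have multiple fine-tuning datasets, how do I pass them all to the fine-tuning script at once, and is it possible to configure how many times each dataset is visited, data augmentation, and so on?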
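To make the question concrete, here is a rough Python sketch of the kind of conversion I have in mind. The flat `query` / `response` / `images` target layout, and the helper names `convert_record`, `parse_grounding`, and `merge_with_repeats`, are my own assumptions for illustration; I have not confirmed that this is the format the fine-tuning script expects. The repeat-and-merge helper is only an offline workaround for per-dataset visit counts, not a feature of the script.

```python
import json
import re

# Hypothetical target schema: a flat record with "query", "response", "images".
# This layout is an assumption for illustration, not a confirmed swift format.
def convert_record(record):
    """Turn one conversations-style record into a flat query/response record."""
    turns = record["conversations"]
    query = next(t["value"] for t in turns if t["from"] == "human")
    response = next(t["value"] for t in turns if t["from"] == "gpt")
    return {"query": query, "response": response, "images": [record["image"]]}

# Matches the answer template "<ref>phrase</ref><box>[[x1, y1, x2, y2]]</box>".
BOX_PATTERN = re.compile(r"<ref>(.*?)</ref><box>(\[\[.*?\]\])</box>")

def parse_grounding(response):
    """Extract (phrase, boxes) from a grounding answer string."""
    match = BOX_PATTERN.search(response)
    if match is None:
        raise ValueError("no <ref>/<box> pair found")
    return match.group(1), json.loads(match.group(2))

def merge_with_repeats(datasets, repeats):
    """Concatenate datasets, repeating each one the requested number of times."""
    merged = []
    for records, times in zip(datasets, repeats):
        merged.extend(records * times)
    return merged

sample = {
    "id": "n0167",
    "image": "xxxxx",
    "conversations": [
        {"from": "human",
         "value": "<image>\nPlease provide the bounding box coordinate of "
                  "the region this sentence describes: <ref>xxx</ref>"},
        {"from": "gpt",
         "value": "<ref>xxx</ref><box>[[308, 765, 592, 1094]]</box>"},
    ],
}

flat = convert_record(sample)
phrase, boxes = parse_grounding(flat["response"])
# One JSON object per line (JSONL) is a common layout for custom datasets.
print(json.dumps(flat, ensure_ascii=False))
```

Parsing the box back out with `parse_grounding` is only a sanity check that the `<ref>`/`<box>` tags survive the conversion unchanged.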

May 11 '24 08:05 MVP-D77
