internvl-chatv1.2-plus: how do I pass multiple images to the model?
Hello, and thanks for your interest. Are you looking for multi-image Q&A or batch inference?
Hello, for multi-image Q&A, what should the data format look like? Or is it not supported yet? Thanks.
I'd like to know whether multi-image Q&A is supported, where a single question includes 5 images.
V1.5 supports multi-image Q&A. The format is described in the README here: https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5
It looks roughly like this:
# multi-round multi-image conversation
# (torch, load_image, model, tokenizer, and generation_config are set up as in the linked README)
pixel_values1 = load_image('./examples/image1.jpg', max_num=6).to(torch.bfloat16).cuda()
pixel_values2 = load_image('./examples/image2.jpg', max_num=6).to(torch.bfloat16).cuda()
pixel_values = torch.cat((pixel_values1, pixel_values2), dim=0)
question = "Describe the two pictures in detail"
response, history = model.chat(tokenizer, pixel_values, question, generation_config, history=None, return_history=True)
print(question, response)
question = "What are the similarities and differences between these two pictures"
response, history = model.chat(tokenizer, pixel_values, question, generation_config, history=history, return_history=True)
print(question, response)
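What I actually want is a single prompt that contains 5 images: the question includes one image, and the model picks one of four candidate images (A, B, C, D) and returns it as the answer.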
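One possible way to set this up, as a rough sketch only: concatenate the five images in a fixed order (question image first, then options A to D) the same way the two-image example above does, and spell that order out in the text prompt. The helper names `build_choice_prompt` and `parse_choice` are hypothetical and not part of InternVL, and nothing in this thread confirms how reliably V1.5 tracks image order with five images, so treat it as something to experiment with.

```python
import re

# Hypothetical option labels for the four candidate images.
CHOICE_LABELS = ["A", "B", "C", "D"]


def build_choice_prompt(labels=CHOICE_LABELS):
    """Describe the image order in text: image 1 is the question, the rest are options."""
    lines = ["The first image is the question image."]
    for index, label in enumerate(labels, start=2):
        lines.append(f"Image {index} is option {label}.")
    lines.append(
        "Which option best matches the question image? "
        "Answer with a single letter: " + ", ".join(labels) + "."
    )
    return "\n".join(lines)


def parse_choice(response, labels=CHOICE_LABELS):
    """Return the first standalone uppercase option letter in the reply, or None."""
    match = re.search(r"\b([" + "".join(labels) + r"])\b", response)
    return match.group(1) if match else None


# Assumed usage, mirroring the README pattern quoted above (not verified for 5 images):
#   paths = [question_path, a_path, b_path, c_path, d_path]
#   tiles = [load_image(p, max_num=6).to(torch.bfloat16).cuda() for p in paths]
#   pixel_values = torch.cat(tiles, dim=0)
#   response = model.chat(tokenizer, pixel_values, build_choice_prompt(), generation_config)
#   answer = parse_choice(response)
```

Note that `parse_choice` is deliberately naive: a reply that starts with the article "A" would be read as option A, so asking the model for a single-letter answer matters here. Five images at `max_num=6` also means many more tiles than the two-image example, so memory use and context length may need checking.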