2 comments by JuntongWang
In my current attempts, I can get the logits correctly for single-image inference. However, issues arise when I try to process multiple images. Could you please provide a minimal, runnable...
> Official demo
>
> ```
> # multi-image multi-round conversation, separate images
> pixel_values1 = load_image('./examples/image1.jpg', max_num=12).to(torch.bfloat16).cuda()
> pixel_values2 = load_image('./examples/image2.jpg', max_num=12).to(torch.bfloat16).cuda()
> pixel_values = torch.cat((pixel_values1, pixel_values2), dim=0)
> ...
> ```
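
Since `torch.cat(..., dim=0)` in the quoted demo merges the tiles of both images into a single batch, it can help to keep track of which rows belong to which image. The following is only a minimal sketch of that bookkeeping, not the library's own code: the helper name `tile_slices` and the tile counts `7` and `13` are hypothetical, and it assumes each image contributes a known number of tiles along dimension 0.

```python
from typing import List, Tuple


def tile_slices(num_patches_list: List[int]) -> List[Tuple[int, int]]:
    """Return the (start, end) row range of each image in the concatenated batch."""
    slices = []
    start = 0
    for n in num_patches_list:
        if n <= 0:
            raise ValueError("each image must contribute at least one tile")
        slices.append((start, start + n))
        start += n
    return slices


# Hypothetical tile counts; with the quoted demo these would be
# pixel_values1.size(0) and pixel_values2.size(0).
num_patches_list = [7, 13]
print(tile_slices(num_patches_list))
```

Each `(start, end)` pair could then be used to slice the concatenated tensor, for example `pixel_values[start:end]`, to check that the rows for each image are what single-image inference would have seen.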