zwhus
Thanks, but could you provide the relevant metrics (T2I/I2T R@1) on COCO and Flickr30k for Phi-3V in the with-finetuning setting?
Thanks