fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".

Results 5 fromage issues

Hello, could you share the code for downloading the CC3M dataset? After downloading the dataset, how do I prepare the TSV file?

Hi, thanks for sharing your great paper and code! I am wondering about a use case in retrieval-only mode (without dialogue or question answering). Is training the "Image-captioning" model...

Thanks for the great work! I saw in the appendix that you report results on the VQAv2 dataset, which is really interesting and showcases the effectiveness of FROMAGe. We are...

Hi, thanks again for the nice work! I was trying to reproduce the experiments on VQAv2 using your pretrained weights and evaluate them using this [repo](https://github.com/GT-Vision-Lab/VQA) mentioned in the paper. However,...

Hi, thanks a lot for your great work. I am evaluating replacing the OPT LLM with other LLMs such as Mistral-7B-v0.1 or Phi-3-mini-4k-instruct. I had to make minor code modifications...