Weizhi Wang
> Update: The error occurs when I use the faiss-gpu PIP package from https://github.com/kyamagu/faiss-wheels (in Rocky Linux 9 with Python 3.9 and CUDA 11.7). If I use Anaconda3 with Python...
Thanks for the great comments and ideas! We are currently working on adapting VaLM to vision-language tasks, especially image captioning and VQA. We will add more experimental results to the...
Hi, apologies for releasing the code a little late due to personal issues. The code is now publicly available. I will immediately organize the documentation for easier reproduction...
Hi, thank you so much for your interest in our work. 1) The 768-dimensional feature vectors take up 274 GB of disk space, and the trained faiss index takes up...
Hi, one of the 9 prompts for color reasoning in the code differs from the one in the paper appendix. I will update the paper accordingly. Please use the prompts in evaluation_scripts/verify_color_prediction.py.
> > Most of downloading urls of images in ocr_vqa dataset are no longer available. Everyone has to rerun the downloading script to get a small portion of ocr_vqa images...
You should be able to download it directly by clicking the link. I just tried it and it worked fine.
Thank you so much for your interest in our work. I have cleaned up all unnecessary scripts and files. The training and evaluation scripts have also just been released as demos. I...