LAVIS
Discrepancy with BLIP paper results when using PyTorch > 1.10
I was trying to reproduce results with BLIP on VQAv2 test-dev and I observed a non-negligible difference between the VQA accuracy obtained using the published checkpoint (77.41%) and the number reported in the paper (78.25%).
These are the steps I followed:
- Clone this repo
- Install dependencies with `pip install .`
- Create a symlink `cache/coco/images` pointing to the local copy of the COCO images
- Modify `lavis/projects/blip/eval/vqav2_eval.yaml` as follows:
- Run `python -m torch.distributed.run --nproc_per_node=4 evaluate.py --cfg-path lavis/projects/blip/eval/vqav2_eval.yaml` (note I only have 4 A100 GPUs available)
- Submit the `test_vqa_result.json` file generated in `lavis/output/BLIP/VQA/...` to EvalAI
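For anyone repeating the symlink step, here is a minimal Python sketch of it. The function name `link_coco_images` and the `cache_root` parameter are hypothetical, introduced only for illustration; the only thing taken from the steps above is the `coco/images` layout under the cache directory.

```python
from pathlib import Path


def link_coco_images(local_images, cache_root):
    """Create <cache_root>/coco/images as a symlink to a local COCO image folder.

    Hypothetical helper: `cache_root` is assumed to be the `cache` directory
    mentioned in the steps above.
    """
    local_images = Path(local_images).resolve()
    if not local_images.is_dir():
        raise NotADirectoryError(f"{local_images} is not a directory")

    link = Path(cache_root) / "coco" / "images"
    link.parent.mkdir(parents=True, exist_ok=True)

    if link.is_symlink():
        # Re-running is fine as long as the link already points at the same folder.
        if link.resolve() == local_images:
            return link
        raise FileExistsError(f"{link} already points somewhere else")
    if link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")

    link.symlink_to(local_images, target_is_directory=True)
    return link
```

Calling it with the path to the local COCO images and the `cache` directory should reproduce the third step; it refuses to overwrite an existing directory or a link that points elsewhere.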
After some debugging, I narrowed it down to a discrepancy in PyTorch versions: I was using the latest version (1.13.0), while LAVIS pins the version to 1.10.0. So some change between PyTorch 1.10 and PyTorch 1.13 causes a performance degradation when loading a checkpoint trained on 1.10. After downgrading PyTorch to 1.10.0, I am able to achieve 78.24% VQA accuracy on VQAv2 test-dev, almost the same as the number reported in the paper.
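To catch this before running an evaluation, one option is a small guard that compares the installed version against the pinned 1.10 series. This is only a sketch: `parse_torch_version` and `newer_than_pinned` are hypothetical helper names, not part of LAVIS, and the comparison looks at major.minor only, on the assumption that patch releases of 1.10 behave the same.

```python
import re

# Version series that LAVIS pins, as reported in this issue.
PINNED = (1, 10)


def parse_torch_version(version):
    """Turn a version string such as '1.13.0+cu117' into (1, 13, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def newer_than_pinned(version, pinned=PINNED):
    """Return True if major.minor of `version` is above the pinned series."""
    return parse_torch_version(version)[:2] > pinned
```

In practice you would pass `torch.__version__` to `newer_than_pinned` and print a warning (or abort) before evaluating a checkpoint trained on 1.10.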
Hi @oscmansan,
Thanks for raising the issue and providing the detailed report. We are now subscribed to this issue.
I will do the following:
- [ ] Check the issue with Torch>1.10.
- [ ] Localize the source of the issue.
However, please expect some delays and stick with Torch 1.10 before this is fully investigated.
Thanks.
How can I install LAVIS successfully with torch > 1.10.0?
I always get an error about torchvision.
I got the same result as you (77.41%), and I will try torch 1.10 now.
With PyTorch 1.10, I get the correct result.