Discrepancy with BLIP paper results when using PyTorch > 1.10

Open oscmansan opened this issue 1 year ago • 2 comments

I was trying to reproduce results with BLIP on VQAv2 test-dev and I observed a non-negligible difference between the VQA accuracy obtained using the published checkpoint (77.41%) and the number reported in the paper (78.25%).

These are the steps I followed:

  1. Clone this repo
  2. Install dependencies with pip install .
  3. Create a symlink cache/coco/images pointing to the local copy of the COCO images
  4. Modify lavis/projects/blip/eval/vqav2_eval.yaml as follows: [image]
  5. Run python -m torch.distributed.run --nproc_per_node=4 evaluate.py --cfg-path lavis/projects/blip/eval/vqav2_eval.yaml (note I only have 4 A100 GPUs available)
  6. Submit the test_vqa_result.json file generated in lavis/output/BLIP/VQA/... to EvalAI

After some debugging, I narrowed it down to a discrepancy in PyTorch versions: I was using the latest version (1.13.0), while LAVIS pins the version to 1.10.0. So some change between PyTorch 1.10 and PyTorch 1.13 appears to degrade performance when loading a checkpoint trained on 1.10. After downgrading PyTorch to 1.10.0, I am able to achieve 78.24% VQA accuracy on VQAv2 test-dev, almost the same as the number reported in the paper (78.25%).
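For anyone who wants to guard against this before running an evaluation, here is a minimal sketch of a version check. It is not part of LAVIS: the names `PINNED_MAJOR_MINOR`, `major_minor`, and `check_torch_version` are hypothetical, and it assumes only that the 1.10 series is the one that reproduces the paper numbers, as described above.

```python
import re
import warnings
from typing import Tuple

# Assumption: the 1.10 series is the one reported in this issue to match the paper.
PINNED_MAJOR_MINOR = (1, 10)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.13.0+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_torch_version(version: str,
                        pinned: Tuple[int, int] = PINNED_MAJOR_MINOR) -> bool:
    """Return True if `version` is in the pinned series; warn and return False otherwise."""
    ok = major_minor(version) == pinned
    if not ok:
        warnings.warn(
            f"PyTorch {version} differs from the pinned "
            f"{pinned[0]}.{pinned[1]} series; evaluation results may not "
            "match the reported numbers."
        )
    return ok


if __name__ == "__main__":
    # Imported here so the helpers above can be used without PyTorch installed.
    import torch

    check_torch_version(torch.__version__)
```

Calling `check_torch_version(torch.__version__)` at the top of an evaluation script would surface the mismatch early instead of after a full test-dev submission. It only compares major and minor numbers, so local build suffixes such as `+cu117` are ignored.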

oscmansan avatar Nov 09 '22 15:11 oscmansan

Hi @oscmansan ,

Thanks for raising the issue and providing the detailed report. We are now subscribed to this issue.

I will do the following:

  • [ ] Check the issue with Torch>1.10.
  • [ ] Localize the source of the issue.

However, please expect some delays and stick with Torch 1.10 before this is fully investigated.

Thanks.

dxli94 avatar Nov 10 '22 00:11 dxli94

How can I install successfully with torch > 1.10.0? There is always an error about torchvision. [image: 2023-01-18]

HerocatUED avatar Jan 18 '23 03:01 HerocatUED

I got the same result as you (77.41%), and I will try torch 1.10 now.

stlmx avatar Oct 05 '23 17:10 stlmx

With PyTorch 1.10, I get the correct result.

stlmx avatar Oct 05 '23 19:10 stlmx