PubMedCLIP
Issue with VQA RAD training
Hi, I have the same problem as https://github.com/sarahESL/PubMedCLIP/issues/8#issue-1196762293, and I cannot solve it by re-running the script.
```
Traceback (most recent call last):
  File "main/main.py", line 85, in <module>
    question_classify.load_state_dict(pretrained_model)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for classify_model:
size mismatch for w_emb.emb.weight: copying a param with shape torch.Size([1178, 300]) from checkpoint, the shape in current model is torch.Size([1260, 300]).
```
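Before calling `load_state_dict`, it can help to diff the checkpoint's tensor shapes against the current model's. The sketch below uses only standard PyTorch; the toy `nn.Embedding` stands in for the real `w_emb.emb` layer and reproduces the 1178-vs-1260 mismatch:

```python
import torch.nn as nn

def report_shape_mismatches(model: nn.Module, checkpoint: dict) -> list:
    """List (name, checkpoint_shape, model_shape) for every parameter
    whose shape in the checkpoint disagrees with the current model."""
    mismatches = []
    model_state = model.state_dict()
    for name, tensor in checkpoint.items():
        if name in model_state and model_state[name].shape != tensor.shape:
            mismatches.append(
                (name, tuple(tensor.shape), tuple(model_state[name].shape))
            )
    return mismatches

# Toy reproduction of the error above (real layer is w_emb.emb):
model = nn.Embedding(1260, 300)               # current model: ntoken = 1260
ckpt = nn.Embedding(1178, 300).state_dict()   # checkpoint:    ntoken = 1178
print(report_shape_mismatches(model, ckpt))
# -> [('weight', (1178, 300), (1260, 300))]
```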
In https://github.com/sarahESL/PubMedCLIP/blob/main/QCR_PubMedCLIP/lib/utils/create_dictionary.py, the `create_dictionary` function uses both the train and test files to build the dictionary (nvocab = 1260), but in the training code the tf-idf loading module uses only the train set (nvocab = 1178). I suspect the mismatch comes from the different question sets used for dictionary creation versus the tf-idf computation. Could you please look into this?
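To make the suspected cause concrete, here is a minimal, self-contained illustration (the `build_vocab` helper and the sample questions are hypothetical, not the repository's actual code): building one vocabulary from train + test questions and another from train only yields different sizes, which is exactly what produces the embedding shape mismatch.

```python
def build_vocab(question_sets):
    """Assign an index to each distinct token, mimicking dictionary creation."""
    vocab = {}
    for questions in question_sets:
        for q in questions:
            for tok in q.lower().split():
                vocab.setdefault(tok, len(vocab))
    return vocab

# Toy question sets (placeholders for the real VQA-RAD files):
train_qs = ["is there a fracture", "what organ is shown"]
test_qs = ["is the lesion malignant"]

dict_vocab = build_vocab([train_qs, test_qs])  # like create_dictionary: train + test
tfidf_vocab = build_vocab([train_qs])          # like the tf-idf loader: train only

print(len(dict_vocab), len(tfidf_vocab))  # the two sizes differ
```

One consistency fix would be to feed both steps the same question files, so the dictionary size and the tf-idf vocabulary size always agree.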
Thanks for your report. I have been trying to fix this, but every time one error gets fixed, something else comes up! The problem is that the main QCR project (accessible at https://github.com/Awenbocc/med-vqa) does not provide the scripts for creating the dictionary, labels, and other input files. My scripts `create_dictionary`, `create_labels`, etc. were mainly developed and tested on the SLAKE dataset, and adapting them to also support VQA-RAD does not seem straightforward.
So until I figure out a solution that supports both datasets, my suggestion is to use the already processed data that the QCR project provides at https://github.com/Awenbocc/med-vqa/tree/master/data. This is the data I also used for the rest of the pipeline when experimenting with VQA-RAD.
Thank you for your reply.
Since the dataset_RAD and dataset_SLAKE files do not exist in the MEVF_PubMedCLIP folder, I used the code in QCR_PubMedCLIP. However, with the processed data you pointed to, I get an error: answers that appear in both open and closed questions are not distinguished there, while the dataloader used in the QCR model keeps them separate. I suspect the dataloader intended for MEVF is missing. It works well when I use dataset_RAD.py from the MEVF GitHub repository with the CLIP options added. Could you please check this? Thank you.