MedCLIP

Model weights don't seem to load correctly while running the example

Open kyleliang919 opened this issue 2 years ago • 7 comments

The results I get differ from what I expected. I wonder whether this is related to the warning below. Here are the logits, which are completely different from the example shown in the README.

{'logits': tensor([[0.3603, 0.4735, 0.1625, 0.2380, 0.3830]], device='cuda:0',
       grad_fn=<StackBackward0>), 'class_names': ['Atelectasis', 'Cardiomegaly', 'Consolidation', 'Edema', 'Pleural Effusion']}
Some weights of the model checkpoint at microsoft/swin-tiny-patch4-window7-224 were not used when initializing SwinModel: ['classifier.weight', 'classifier.bias']
- This IS expected if you are initializing SwinModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing SwinModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at emilyalsentzer/Bio_ClinicalBERT were not used when initializing BertModel: ['cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
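
For readers unsure how to read this warning: it lists checkpoint entries that the model being built has no slot for. The sketch below is a plain-Python illustration of that comparison, not the Transformers implementation. The two `classifier.*` names come from the log above; the backbone key names and the `diff_keys` helper are hypothetical. A backbone-only model loaded from a classification checkpoint is the case the warning itself labels as expected, so on its own it may not explain the logits.

```python
# Sketch only: backbone key names and diff_keys are hypothetical.
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, both sorted."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    # Keys the model needs but the checkpoint lacks (these stay at their init values).
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the checkpoint has but the model does not use (these are dropped).
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


backbone_keys = ["embeddings.weight", "encoder.weight"]  # hypothetical names
checkpoint_keys = backbone_keys + ["classifier.weight", "classifier.bias"]

missing, unexpected = diff_keys(backbone_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

An empty `missing` list would mean every parameter the model needs was found in the checkpoint. In PyTorch, `load_state_dict(strict=False)` reports the same two lists as `missing_keys` and `unexpected_keys`.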

kyleliang919 avatar May 07 '23 18:05 kyleliang919

I also got a wrong result: 0.3580, 0.4737, 0.1565, 0.2269, 0.3839. It is not the same as yours, but close. I guess some weights are not loaded properly, or are just randomly initialized.
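
To put a number on "close", here is a small sketch that compares the two sets of logits reported so far. The values are copied from the comments above; `max_abs_diff` is a helper name made up for this illustration. Whether the gap points to partially loaded weights or to random initialization is not something this comparison can settle.

```python
# Logits copied from the two reports in this thread.
first_report = [0.3603, 0.4735, 0.1625, 0.2380, 0.3830]
second_report = [0.3580, 0.4737, 0.1565, 0.2269, 0.3839]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


diff = max_abs_diff(first_report, second_report)
print(f"{diff:.4f}")  # prints 0.0111
```

The largest gap is on the fourth class (Edema), and all five values agree to within about 0.011.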

Have you found any solution?

imdoublecats avatar Jul 12 '23 08:07 imdoublecats

Can you share the code for zero-shot classification?

deepankarvarma avatar Feb 26 '24 06:02 deepankarvarma

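This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.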

imdoublecats avatar Feb 26 '24 06:02 imdoublecats

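This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.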

imdoublecats avatar Apr 06 '24 05:04 imdoublecats

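I ran into the same problem. Running the test case gives {'logits': tensor([[0.3618, 0.4737, 0.1641, 0.2459, 0.3842]], device='cuda:0', which is completely different from the sample result, and this seems to be because the model weights were not loaded correctly: (image) Is there any solution?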

XNLHZ avatar Apr 06 '24 05:04 XNLHZ

I have encountered the same issue and hope the authors can address it.

shunliu01 avatar May 06 '24 13:05 shunliu01

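This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.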

imdoublecats avatar May 06 '24 13:05 imdoublecats