dsmil-wsi
Same attention score and the pre-trained aggregators.
Dear Bin, thank you for your great work!
- When I reproduce the results on C16 and TCGA, I follow the provided readme: 1) using pre-computed features from "Download feature vectors for MIL network" (`python download.py --dataset=tcga` or `--dataset=c16`); 2) training the model with all hyperparameters at their defaults (`python train_tcga.py --dataset=TCGA-lung-default` / `python train_tcga.py --dataset=Camelyon16 --num_classes=1`). The exact commands are collected in the sketch below this list. For C16, I see only a mild degradation, to 91% accuracy, unlike #54 with only 60%. But I did find that every patch produces the same attention score, as in #54. For TCGA, the same attention scores also appear, yet the results are quite promising (e.g., train loss: 0.3307, test loss: 0.3239, average score: 0.9000, AUC: class-0>>0.9715089374829871 | class-1>>0.9658833136738953). The identical attention scores on C16 can sometimes be fixed by restarting the training with `init.pth` loaded, but the issue is never resolved on TCGA. How should I deal with it?
- When I apply the provided pre-trained aggregators (`test/weights/aggregator.pth` or `test-c16/weights/aggregator.pth`) to the test set of the pre-computed features from "Download feature vectors for MIL network" (`python download.py --dataset=tcga/c16`), I get reasonable results on C16 (average score: 0.9125, AUC: class-0>>0.9546666666666667) but unreasonable ones on TCGA (average score: 0.6857, AUC: class-0>>0.8621722166772525 | class-1>>0.8949278649850286). I wonder whether these pre-trained aggregators only work with the provided embedders (`test/weights/embedder.pth` or `test-c16/weights/embedder.pth`) rather than with the pre-computed features. In other words, were the pre-computed features not generated by these pre-trained embedders?
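For reference, the commands above collected in one place (a sketch of what I ran, with all hyperparameters left at their defaults):

```bash
# Download the pre-computed feature vectors (readme: "Download feature vectors for MIL network")
python download.py --dataset=tcga
python download.py --dataset=c16

# Train the aggregator with the default hyperparameters
python train_tcga.py --dataset=TCGA-lung-default
python train_tcga.py --dataset=Camelyon16 --num_classes=1
```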
Looking forward to your help! Best, Tiancheng Lin
Hi Bin, I solved the problem of identical attention scores by removing the dimension normalization, and the performance is comparable. However, I am still confused about the pre-trained models and the pre-computed features.
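In case it helps others, here is a minimal, self-contained sketch of the change, assuming "dimension normalization" refers to the 1/sqrt(d) scaling applied to the attention logits before the softmax over patches in the aggregator (adapt it to your copy of `dsmil.py`; the tensor names here are illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_patches, feat_dim, num_classes = 100, 128, 1
Q = torch.randn(num_patches, feat_dim)       # per-patch queries
q_max = torch.randn(num_classes, feat_dim)   # query of the critical patch

logits = torch.mm(Q, q_max.transpose(0, 1))  # raw attention logits, (num_patches, num_classes)

# With the dimension normalization (divide by sqrt(d) before the softmax):
A_scaled = F.softmax(logits / torch.sqrt(torch.tensor(float(feat_dim))), dim=0)

# With the normalization removed (the change I made):
A_plain = F.softmax(logits, dim=0)

# The unscaled attention scores are much less uniform across patches.
print(A_scaled.std().item(), A_plain.std().item())
```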
Hi, please make sure that the weights are indeed fully loaded into your model without any mismatch; you can pass `strict=True` to `load_state_dict()` when loading the checkpoint. There are multiple `embedder.pth` files available, and the downloaded features were computed using one of them (possibly not the same one included in the download, because I updated them once afterward). But you can always use that embedder to recompute new features and then test with that aggregator. You can find all the embedders I trained for the two datasets in Camelyon16 and TCGA.
Hi, thank you for your quick help! Could you release more aggregators?
One more question about `init.pth`: as mentioned in #26, it was trained with a few iterations on the Camelyon16 dataset following the original training/testing split. I would appreciate it if you could share the detailed settings you used for it. Thank you very much!
Hi, @HHHedo & @binli123. I have the same question as @HHHedo. I am focusing on the TCGA part now and followed the instructions:
- Using pre-computed features from "Download feature vectors for MIL network": `$ python download.py --dataset=tcga`
- Training the model (with all hyperparameters as default): `$ python train_tcga.py --dataset=TCGA-lung-default`

For TCGA, I get the same attention score as @HHHedo. I don't know why the score is already so high at the first epoch; you can see my screenshots. After the 3rd epoch, no better model is ever saved, which really confuses me.
Could you tell me why and how to fix it? Thank you very much.