Difference between model output of eval.py and create_heatmaps.py
I am trying to use CLAM for a project with the Camelyon dataset. For a single slide, when I run the eval script I get a Tumor prediction for a Tumor slide, so no problem there. But when I run create_heatmaps.py to create the heatmap for the same slide, the model prediction is Normal.
When I dug deeper into this, I noticed that in create_heatmaps.py the model output is obtained using a downsampled version of the WSI. So I have two questions:
1) What is the reason for this difference? Is it common, and have you encountered it before?
2) What happens if I remove the blocky heatmap calculation and instead get the model prediction from attention scores computed on the WSI with the original parameters? I'm asking because in the code the scores calculated in the blocky heatmap part are used as the ref_scores for the second part. Are there any downsides to removing this?
Thank you.
hi, so I think the only difference between the prediction from eval.py and the prediction in create_heatmaps.py (assuming you specified the config to use the same model checkpoint, same magnification level, etc.) is that they might be using different sets of segmentation/patching parameters. In other words, eval.py uses pre-patched/extracted features from create_patches_fp.py/extract_features_fp.py, whereas for convenience create_heatmaps.py generates these on the fly (potentially with a different set of parameters). I guess in some cases this difference might be enough to drive the model to a different prediction.
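To make the point above concrete, here is a toy sketch (plain Python, not CLAM code, with made-up region/patch sizes): the bag of patches the model sees depends directly on the patching parameters, so even changing only the step size changes how many patches are extracted, and hence the features fed to the model.

```python
def patch_coords(region_size, patch_size, step_size):
    """Top-left coordinates of patches tiled over a 1-D region (toy example)."""
    return list(range(0, region_size - patch_size + 1, step_size))

region, patch = 2048, 256
non_overlap = patch_coords(region, patch, step_size=256)  # step == patch size
overlap = patch_coords(region, patch, step_size=128)      # 50% overlap

print(len(non_overlap))  # 8 patches
print(len(overlap))      # 15 patches
```

With different patch counts and locations, the attention-pooled slide representation differs, which is enough to flip a borderline prediction.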
So I think I found the problem. In create_heatmaps.py, line 187, we have this line: blocky_wsi_kwargs = {'top_left': None, 'bot_right': None, 'patch_size': patch_size, 'step_size': patch_size, ... — here step_size is set equal to patch_size, which is why the patching doesn't match the config even if I change step_size. When I changed the parameter to 'step_size': step_size, it worked correctly and gave me the same prediction (although the probabilities are still not exactly equal), which fixed my problem.
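For reference, a minimal sketch of the change described above. The values below are hypothetical stand-ins, and the actual dict in create_heatmaps.py carries additional keys; only the step_size entry changes:

```python
# Hypothetical stand-ins for values read from the heatmap config
patch_size = (256, 256)
step_size = (128, 128)

# Before: step_size is hard-coded to patch_size, so the configured
# step_size is silently ignored during the blocky-heatmap patching.
blocky_wsi_kwargs = {'top_left': None, 'bot_right': None,
                     'patch_size': patch_size, 'step_size': patch_size}

# After: pass the configured step_size through, so the on-the-fly
# patching matches the parameters used when features were extracted.
blocky_wsi_kwargs = {'top_left': None, 'bot_right': None,
                     'patch_size': patch_size, 'step_size': step_size}
```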
Btw, what's your take on my other question? Are there any downsides to not using ref_scores in the heatmap calculation?
it's totally fine to not use ref_scores if you don't plan on generating two different heatmaps for the same slide (one blocky, one fine-grained). The ref_scores are just there to ensure that percentile scores are computed against the same set of reference scores.
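As an illustration of what ref_scores buys you, here is a toy sketch (not CLAM's actual implementation) of converting attention scores to percentiles against a shared reference, so that two heatmaps of the same slide end up on a common color scale:

```python
import numpy as np

def to_percentiles(scores, ref_scores=None):
    """Map raw attention scores to percentile scores in [0, 100].

    With ref_scores, percentiles are computed against that shared
    reference (e.g. blocky-heatmap scores), so two heatmaps of the
    same slide are directly comparable; without it, each score array
    is ranked only against itself.
    """
    scores = np.asarray(scores, dtype=float)
    ref = scores if ref_scores is None else np.asarray(ref_scores, dtype=float)
    # fraction of reference scores each score meets or exceeds
    return np.array([(ref <= s).mean() * 100 for s in scores])

blocky = np.array([0.1, 0.5, 0.9])
fine = np.array([0.2, 0.5, 0.8, 0.95])
# same slide, two heatmaps: rank the fine-grained scores against the blocky ones
print(to_percentiles(fine, ref_scores=blocky))
```

Dropping ref_scores just means each heatmap is normalized independently, which is harmless if you only ever generate one heatmap per slide.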
@erentknn, Can you possibly share the config and process list csv? I would like to see how you are using that dataset.
@abdulsami34 I am using the training set of Camelyon16 for training and the rest for testing. For the config I used tcga.csv with use_otsu set to True; I didn't change anything else.