Is it possible to use this method for ViTs trained for regression?
I fine-tuned a model from `timm` for a regression problem by setting `nb_classes` to 1 and training with MSE loss. How can I use this repo to generate a saliency map when there are no distinct classes?
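For context, my fine-tuning setup is roughly the following (simplified sketch; the model name, data, and shapes are just placeholders, and I'm assuming my script's `nb_classes` ends up as timm's `num_classes` argument):

```python
import timm
import torch
import torch.nn as nn

# A single output unit turns the classification head into a regression head.
model = timm.create_model('vit_base_patch16_224', pretrained=True, num_classes=1)
criterion = nn.MSELoss()

images = torch.randn(8, 3, 224, 224)   # dummy batch; real data is preprocessed images
targets = torch.randn(8, 1)            # continuous targets, not class labels

preds = model(images)                  # shape [8, 1] because num_classes=1
loss = criterion(preds, targets)
loss.backward()
```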
I also have the same question for the authors.
Hello,
I am also trying to apply this method to models that were fine-tuned for regression tasks. These models use a custom Vision Transformer implementation from the RETFound repository (specifically the ViT-L architecture with 16×16 patch size).
I successfully loaded their weights using the vit_large_patch16_224() function from this repo (from ViT_LRP.py), setting num_classes=1.
To extract relevance maps, I then use an adapter that wraps the LRP class and calls generate_LRP() with index=None and method='transformer_attribution'. I also post-process the output to reshape the relevance scores to a [14,14] patch grid.
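Roughly, the adapter does the following (simplified sketch; the import paths may differ slightly depending on the repo layout, and the checkpoint path and key lookup are placeholders for however the RETFound weights were saved):

```python
import torch
from baselines.ViT.ViT_LRP import vit_large_patch16_224
from baselines.ViT.ViT_explanation_generator import LRP

# Build the LRP-compatible ViT-L/16 with a single regression output.
model = vit_large_patch16_224(pretrained=False, num_classes=1)

# Load the RETFound fine-tuned weights (placeholder path; adjust the key
# lookup to match how the checkpoint was actually saved).
checkpoint = torch.load('retfound_finetuned.pth', map_location='cpu')
state_dict = checkpoint.get('model', checkpoint)
model.load_state_dict(state_dict, strict=False)
model.eval()

# Wrap the model with the repo's explanation generator.
attribution_generator = LRP(model)

image = torch.randn(1, 3, 224, 224)  # dummy tensor; a real, normalized image in practice

# With a single output unit there is no class index to choose, so index=None.
relevance = attribution_generator.generate_LRP(
    image,
    index=None,
    method='transformer_attribution',
).detach()

# Post-process: reshape the 196 patch relevances into the 14x14 grid of a 224/16 ViT.
relevance_map = relevance.reshape(1, 1, 14, 14)
```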
Do you think your method can be applied correctly in this setup?