PyTorch-Model-Compare
Works fine with the whole model but raises "NANs" on selected layers.
When I was trying to compare the same model trained on different datasets, I encountered a weird problem:
It works fine when I compare all layers:
cka = CKA(model1, model2, device='cuda', model1_name='model1', model2_name='model2')
But, when I try to compare a selected subset of layers:
cka = CKA(model1, model2, device='cuda', model1_name='model1', model2_name='model2', model1_layers=list(model1.state_dict().keys())[:5], model2_layers=list(model2.state_dict().keys())[:5])
It raises:
HSIC computation resulted in NANs
Do you have any idea how to fix this? Thank you very much.
In my case, the NaNs were raised when no hook was created (here), in which case no features were returned here, resulting in the division-by-zero error at L180. It may help to check whether the hooks are being registered properly.
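One way to check this, sketched under an assumption the thread does not confirm: that the library registers forward hooks by matching the requested layer names against the names from model.named_modules(). If so, the keys of state_dict() are parameter names such as "0.weight", not module names, and none of them would match. The helper split_layer_names below is hypothetical and only illustrates the check on a toy model.

```python
import torch.nn as nn


def split_layer_names(model: nn.Module, requested):
    """Split requested names into those that are module names and those that are not.

    Hypothetical helper: it assumes hooks can only attach to names that
    appear in model.named_modules().
    """
    module_names = {name for name, _ in model.named_modules()}
    found = [name for name in requested if name in module_names]
    missing = [name for name in requested if name not in module_names]
    return found, missing


# Toy model standing in for model1 / model2 from the question.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Same selection pattern as in the question: parameter names, not module names.
param_names = list(model.state_dict().keys())[:5]
found, missing = split_layer_names(model, param_names)
print("parameter names:", param_names)
print("matched modules:", found)
print("unmatched names:", missing)

# Names taken from named_modules() are the ones a forward hook can attach to.
module_names = [name for name, _ in model.named_modules() if name]
print("module names:", module_names)
```

If the unmatched list is non-empty for your model, passing names taken from named_modules() as model1_layers and model2_layers may be worth trying.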
I think the issue I am having is that, no matter what, (N - 1) * (N - 2) on line 134 ends up equal to 0, because the matrix being passed in is always of size 2x2, coming from around line 181: K = X @ X.t().
I discovered this happens when the batch size is <= 2. So if anyone else has this issue this might be why!
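To illustrate why small batches break the computation, here is a minimal sketch of the unbiased HSIC estimator in its commonly cited form. It is written from the standard formula, not copied from the library, so treat the exact terms as an assumption. The denominators are (N - 1)(N - 2), (N - 2), and N(N - 3), which suggests that a batch of size 3 would also fail, not only sizes 1 and 2.

```python
import torch


def unbiased_hsic(K: torch.Tensor, L: torch.Tensor) -> torch.Tensor:
    """Unbiased HSIC estimate for two N x N Gram matrices (standard form, assumed)."""
    N = K.shape[0]
    # The estimator ignores the diagonals of both Gram matrices.
    K = K.clone().fill_diagonal_(0)
    L = L.clone().fill_diagonal_(0)
    term1 = torch.trace(K @ L)
    # Zero when N is 1 or 2: tensor division then yields inf or nan.
    term2 = K.sum() * L.sum() / ((N - 1) * (N - 2))
    # Zero denominator when N is 2.
    term3 = 2 * (K @ L).sum() / (N - 2)
    # Zero denominator when N is 3.
    return (term1 + term2 - term3) / (N * (N - 3))


torch.manual_seed(0)
finite = {}
for n in (2, 3, 4, 8):
    X = torch.randn(n, 16)   # n samples of a 16-dimensional feature
    K = X @ X.t()            # Gram matrix, as in K = X @ X.t() above
    finite[n] = bool(torch.isfinite(unbiased_hsic(K, K)))
    print(f"batch size {n}: finite result = {finite[n]}")
```

Under this formula, a batch size of at least 4 is needed. If the dataset size is not a multiple of the batch size, the last batch can still be too small; constructing the DataLoader with drop_last=True is one way to avoid that.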
I set the batch size this low because the method is so slow and memory-consuming. Are there any tricks to make it work without large batch sizes or heavy computation?
I got the same issue: https://github.com/AntixK/PyTorch-Model-Compare/issues/10. Looking for solutions. Thanks!