PyTorch-Model-Compare Works fine with the whole model but raise "NANs" on selected layers.

Open PengHongyiNTU opened this issue 2 years ago • 3 comments

When I was trying to compare the same model trained on different datasets, I encountered a weird problem:

It works fine when I compare all layers: cka = CKA(model1, model2, device='cuda', model1_name='model1', model2_name='model2')

But when I try to compare a selected subset of layers: cka = CKA(model1, model2, device='cuda', model1_name='model1', model2_name='model2', model1_layers=list(model1.state_dict().keys())[:5], model2_layers=list(model2.state_dict().keys())[:5])

it raises:

HSIC computation resulted in NANs

Do you have any idea how to fix this? Thank you very much.

Sep 14 '22 20:09 PengHongyiNTU

In my case, the NaN issue was raised when no hook was created, in which case no features were returned, resulting in the divide-by-zero error at L180. It might be helpful to check whether the hooks are being registered properly.
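One possible reason for missing hooks, offered as an assumption rather than something confirmed in this thread: state_dict() keys are parameter and buffer names such as 'conv1.weight', while forward hooks attach to modules such as 'conv1'. If the library matches model1_layers against module names, those keys would match nothing. A minimal sketch of converting one to the other (module_names_from_state_keys and the example keys are hypothetical, not part of the library):

```python
def module_names_from_state_keys(state_keys):
    """Map parameter/buffer keys (e.g. 'conv1.weight') to their owning module names."""
    names = []
    for key in state_keys:
        # Drop the trailing parameter or buffer name ('weight', 'bias', ...).
        # A key with no dot belongs to the root module, which has an empty name.
        module_name = key.rsplit(".", 1)[0] if "." in key else ""
        # Keep the first occurrence only, preserving the original order.
        if module_name and module_name not in names:
            names.append(module_name)
    return names


# Hypothetical keys, shaped like the start of a ResNet-style state_dict.
example_keys = [
    "conv1.weight",
    "bn1.weight",
    "bn1.bias",
    "bn1.running_mean",
    "layer1.0.conv1.weight",
]
layer_names = module_names_from_state_keys(example_keys)
print(layer_names)
```

With a real model, the result could be compared against [name for name, _ in model1.named_modules()] before passing something like module_names_from_state_keys(model1.state_dict().keys())[:5] as model1_layers.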

Oct 16 '22 21:10 kssteven418

I think the issue I am having is that, no matter what, ((N - 1) * (N - 2)) in line 134 ends up equalling 0, because the matrix being passed in is always 2x2, coming from around line 181: K = X @ X.t().

I discovered this happens when the batch size is <= 2. So if anyone else has this issue this might be why!
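To make the batch-size point concrete, here is a rough sketch that only looks at the denominators. The (N - 1) * (N - 2) term is the one mentioned above; the N * (N - 3) and (N - 2) terms come from the standard unbiased HSIC estimator, and it is an assumption that the library's code uses the same form. The function names are made up for illustration.

```python
def unbiased_hsic_denominators(n):
    """Denominators that appear in the standard unbiased HSIC estimator for batch size n."""
    return {
        "N(N-3)": n * (n - 3),
        "(N-1)(N-2)": (n - 1) * (n - 2),
        "(N-2)": n - 2,
    }


def batch_size_is_safe(n):
    """True only if none of the denominators is zero for this batch size."""
    return all(d != 0 for d in unbiased_hsic_denominators(n).values())


for n in range(1, 6):
    print(n, unbiased_hsic_denominators(n), batch_size_is_safe(n))
```

Under that assumption a batch size of 3 would also hit a zero denominator, so 4 would be the smallest usable batch size. Note that a final, smaller batch from the DataLoader could trigger the same problem unless it is dropped (for example with drop_last=True).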

I set this batch size because this method is so slow and memory-consuming. Are there any tricks for running it without large batch sizes or heavy computation?

Nov 03 '22 03:11 Maddy12

I got the same issue: https://github.com/AntixK/PyTorch-Model-Compare/issues/10. Looking for solutions. Thanks!

Mar 11 '23 16:03 bryanbocao
