musco-pytorch
Running demo code results in "LinAlgError: SVD did not converge" or "ValueError: array must not contain infs or NaNs"
As I already mentioned in Issue 13, the demo code crashes with an error.
from torchvision.models import resnet50
from flopco import FlopCo
from musco.pytorch import CompressorVBMF, CompressorPR, CompressorManual

model = resnet50(pretrained=True)
model.cuda()
model_stats = FlopCo(model, device='cuda')
compressor = CompressorVBMF(model,
                            model_stats,
                            ft_every=5,
                            nglobal_compress_iters=2)
while not compressor.done:
    compressor.compression_step()
compressed_model = compressor.compressed_model
~/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_svd_nonconvergence(err, flag)
104
105 def _raise_linalgerror_svd_nonconvergence(err, flag):
--> 106 raise LinAlgError("SVD did not converge")
107
108 def _raise_linalgerror_lstsq(err, flag):
LinAlgError: SVD did not converge
or
~/anaconda3/lib/python3.8/site-packages/numpy/lib/function_base.py in asarray_chkfinite(a, dtype, order)
495 a = asarray(a, dtype=dtype, order=order)
496 if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
--> 497 raise ValueError(
498 "array must not contain infs or NaNs")
499 return a
ValueError: array must not contain infs or NaNs
Which of the two errors is raised seems to be random when the code is run multiple times.
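Since the second error complains about infs or NaNs, one quick way to rule out the weights themselves is to scan the model parameters for non-finite values before compressing. This is only a diagnostic sketch: the helper name `nonfinite_parameters` and the tiny two-layer model are hypothetical stand-ins, not part of musco.

```python
import torch
from torch import nn


def nonfinite_parameters(model: nn.Module) -> list:
    """Return the names of parameters that contain NaN or Inf values."""
    return [
        name
        for name, param in model.named_parameters()
        if not torch.isfinite(param).all()
    ]


# Tiny stand-in model (hypothetical); in practice, pass the model to be compressed.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3),
    nn.Conv2d(4, 2, kernel_size=1),
)
clean = nonfinite_parameters(model)

# Corrupt one weight on purpose to show what a positive result looks like.
with torch.no_grad():
    model[1].weight[0, 0, 0, 0] = float("nan")
corrupted = nonfinite_parameters(model)

print(clean, corrupted)
```

If the list is empty for the real model, the non-finite values are most likely produced inside the decomposition step rather than coming from the weights.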
I managed to fix it by replacing the scikit-tensor-py3 calls with tensorly calls. The example works fine now, and I also avoided the ugly numpy and scipy downgrade that scikit-tensor-py3 required.
For anyone interested, here is what I did:
Remove any imports of scikit-tensor-py3 functions from musco/pytorch/compressor/decompositions/tucker2.py
Add
import tensorly
tensorly.set_backend("pytorch")
In get_tucker_factors, the weight line becomes:
weights = tensorly.tensor(self.weight.cpu())
The tucker call changes so that it uses tensorly.decomposition.tucker:
core, (U_cout, U_cin, U_dd) = tensorly.decomposition.tucker(weights, [self.ranks[0], self.ranks[1], weights.shape[-1]], init='nvecs')
Finally, a few lines down in the same function, change core = core.dot(U_dd.T)
into core = core.matmul(U_dd.T)
to use PyTorch matrix multiplication (on PyTorch tensors, .dot works only for 1D vectors).