pytorch-minimize
LSMR fails with NaN output for trivial problems
Thanks for the nice library. I wanted to try LSMR but my first attempt to use it with a trivial problem failed with NaN output.
Steps to reproduce:
import torch
import torchmin
A = torch.eye(10)
xtrue = torch.zeros((10, 1))
b = A @ xtrue
x = torchmin.lstsq.lsmr.lsmr(A,b)[0]
print(x)
which resulted in
tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
instead of zeros.
The same happens with a slightly less trivial case:
import torch
import torchmin
A = torch.eye(10)
xtrue = torch.ones((10, 1))
b = A @ xtrue
x = torchmin.lstsq.lsmr.lsmr(A,b)[0]
print(x)
Comparing the source code with that of scipy points to the following issues.
normr and normar are initialised differently, and the "convergence" tests at the very beginning are missing:
https://github.com/rfeinman/pytorch-minimize/blob/1017e9732db83fc9ffa2cb6e8316ed2bb0682d6a/torchmin/lstsq/lsmr.py#L148-L149
instead of
https://github.com/scipy/scipy/blob/2347d9309fbbabb1d3f89d35b7a42d0d53f002b2/scipy/sparse/linalg/_isolve/lsmr.py#L290-L307
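To illustrate the point, here is a minimal sketch of the kind of guarded start-up the SciPy code linked above performs, as I read it: normalise only when the norm is positive, and stop early when alpha * beta is zero. `guarded_init` is a hypothetical helper written for this issue, not a function from torchmin or SciPy, and it assumes a dense 2-D `A`.

```python
import torch

def guarded_init(A, b):
    """Hypothetical helper: first Golub-Kahan step with zero-norm guards."""
    u = b.reshape(-1).clone()
    beta = torch.linalg.vector_norm(u)
    if beta > 0:
        # Safe to normalise: b is not the zero vector.
        u = u / beta
        v = A.T @ u
        alpha = torch.linalg.vector_norm(v)
    else:
        # b == 0: skip the division that would otherwise produce 0/0 = NaN.
        v = b.new_zeros(A.shape[1])
        alpha = b.new_zeros(())
    if alpha > 0:
        v = v / alpha
    # normar = ||A^T r|| for the starting point x = 0.
    normar = alpha * beta
    # If normar is zero, x = 0 already solves the least-squares problem.
    done = bool(normar == 0)
    return u, v, alpha, beta, done

A = torch.eye(10)
u, v, alpha, beta, done = guarded_init(A, torch.zeros((10, 1)))
print(done, torch.isnan(u).any().item())
```

With `b = 0` the sketch reports `done` without ever dividing by `beta`, so the caller can return `x = 0` directly.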
Within the main iteration loop, the convergence test is only performed every 10 iterations (https://github.com/rfeinman/pytorch-minimize/blob/1017e9732db83fc9ffa2cb6e8316ed2bb0682d6a/torchmin/lstsq/lsmr.py#L257-L259) rather than at every iteration (https://github.com/scipy/scipy/blob/2347d9309fbbabb1d3f89d35b7a42d0d53f002b2/scipy/sparse/linalg/_isolve/lsmr.py#L409).
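To show why the test frequency can matter, here is a toy iteration (deliberately not LSMR, and not taken from either code base) that normalises the residual at every step. On an identity system it reaches the solution after the first update, so any further step divides by a zero norm. The `check_every` parameter and the `toy_solve` name are assumptions made up for this sketch.

```python
import torch

def toy_solve(A, b, maxiter=20, tol=1e-10, check_every=1):
    """Toy residual iteration with a configurable stopping-test interval."""
    x = torch.zeros(A.shape[1], dtype=b.dtype)
    for itn in range(1, maxiter + 1):
        r = b - A @ x
        nr = torch.linalg.vector_norm(r)
        # The stopping test only runs on iterations divisible by check_every.
        if itn % check_every == 0 and nr <= tol:
            break
        d = r / nr       # becomes 0/0 = NaN once the residual is exactly zero
        x = x + nr * d   # NaN then propagates into x and never recovers
    return x, itn

A = torch.eye(10, dtype=torch.float64)
b = torch.ones(10, dtype=torch.float64)
x_each, _ = toy_solve(A, b, check_every=1)
x_tenth, _ = toy_solve(A, b, check_every=10)
print(torch.isnan(x_each).any().item(), torch.isnan(x_tenth).any().item())
```

Once NaN has entered `x`, the comparison `nr <= tol` is always false, so a delayed test cannot rescue the run. Testing at every iteration, before the division, avoids this in the toy case.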
For the record, I made a few changes here: https://github.com/cai4cai/torchsparsegradutils/blob/main/torchsparsegradutils/utils/lsmr.py