ChainRules.jl

[WIP] Complex valued SVD

Open GiggleLiu opened this issue 3 years ago • 3 comments

Nice project, I like the test coverage of this package. Sorry for the long delay (see https://github.com/GiggleLiu/BackwardsLinalg.jl/issues/17). I have tried several times to add the backward rules to ChainRules, but I find it hard to write tests that fit this framework. Can someone help me?

  • Added a new function for complex-valued SVD back-propagation. However, I find it hard to test functions that have a gauge freedom.
  • Added a safe_inv function to safely invert the matrix of singular-value differences. Without it, the rule breaks down easily when the spectrum is degenerate.
  • I am also planning to rework the real-valued SVD rule to reduce the number of matrix multiplications.
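To make the gauge problem in the first bullet concrete, here is a minimal sketch (not code from this PR). The example matrix, the phase angles, and the names `gauge_dependent` and `gauge_invariant` are all made up for illustration. It shows that multiplying matching columns of U and V by the same phase gives an equally valid SVD, so only gauge-invariant functions of U and V have a well-defined value to test against.

```julia
using LinearAlgebra

# Hypothetical example matrix, chosen only for illustration.
A = ComplexF64[1 2im; 3 4; 5im 6]
U, S, V = svd(A)

# Gauge transformation: multiply each pair of singular vectors by the same phase.
phases = Diagonal(cis.([0.3, -1.1]))
U2 = U * phases
V2 = V * phases

# (U2, S, V2) is just as valid a decomposition of A as (U, S, V).
reconstruction_ok = U2 * Diagonal(S) * V2' ≈ A

# A gauge-dependent quantity: its value depends on the arbitrary phase choice.
gauge_dependent(U) = real(U[1, 1])

# A gauge-invariant quantity: an entry of the projector onto the leading
# left singular vector, in which the phase cancels.
gauge_invariant(U) = real((U[:, 1] * U[:, 1]')[1, 2])

invariant_ok = gauge_invariant(U2) ≈ gauge_invariant(U)
```

A finite-difference check of a function like `gauge_dependent` is not meaningful, because the SVD routine is free to return a different phase for a perturbed input. Checks built on something like `gauge_invariant` avoid that problem.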

Refs

Mar 24 '21 01:03 GiggleLiu

Before I review, can you comment on how this implementation differs from the existing real one? That is, it looks like the current real implementation would also work on complex numbers if we relaxed the real type constraint, so I'm wondering how this rule differs from that one, and whether they can be unified.

Mar 24 '21 06:03 sethaxen

> Before I review, can you comment on how this implementation differs from the existing real one? That is, it looks like the current real implementation would also work on complex numbers if we relaxed the real type constraint, so I'm wondering how this rule differs from that one, and whether they can be unified.

Sure. If you open the third link, you will see a term highlighted in red; that is the term missing from the real version. For a more detailed description, see the first link, which points to the original paper. I agree they should be in a single function; this is just a draft.

Mar 24 '21 09:03 GiggleLiu

Also, please note that safe_inv is important in some applications. For example, in many physics applications the singular values are degenerate because of high symmetry. Normally, when the denominator |s_i^2 - s_j^2| is zero, the numerator should also be zero, because the loss is gauge invariant. However, rounding errors can break this cancellation, which is why we need to handle this case manually to avoid a spurious singularity in the gradient.
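As a rough illustration of the idea (a sketch, not the safe_inv from this PR), one common way to regularize such an inverse is to replace 1/x with x / (x^2 + η) for a small η. The name `safe_inv_sketch`, the value of `η`, and the example singular values below are assumptions made for this example.

```julia
# Hypothetical helper, not the PR's actual safe_inv:
# a regularized inverse that returns 0 instead of Inf at x == 0.
safe_inv_sketch(x; η=1e-12) = x / (x^2 + η)

# Example singular values with a degenerate pair (s_2 == s_3).
S = [2.0, 1.0, 1.0]
S2 = S .^ 2

# D[i, j] = s_j^2 - s_i^2; zero on the diagonal and for degenerate pairs.
D = S2' .- S2

# The naive inverse blows up wherever D is zero.
F_naive = 1 ./ D

# The regularized inverse stays finite everywhere.
F_safe = safe_inv_sketch.(D)
```

With this choice, entries with well-separated singular values are almost unchanged, while exactly degenerate pairs contribute zero. Pairs whose difference is comparable to sqrt(η) are damped as well, so η is a trade-off rather than a free fix.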

Mar 24 '21 11:03 GiggleLiu