
[WIP] Fix eigen rules for matrices with exactly repeating eigenvalues

Open · sethaxen opened this pull request 3 years ago · 4 comments

Based on discussion at https://github.com/google/jax/issues/669#issuecomment-800805598, this PR implements a simple fix for matrices with exactly repeating eigenvalues (which are rare). The PR still needs tests. While it's not too hard to contrive a Hermitian matrix with exactly repeating eigenvalues (e.g. a diagonal matrix), I've had no luck so far coming up with a non-Hermitian matrix that after eigen still has exactly equal eigenvalues. Perhaps for the purposes of such a test, we could mock eigen somehow?

sethaxen avatar Mar 17 '21 06:03 sethaxen

While it's not too hard to contrive a Hermitian matrix with exactly repeating eigenvalues (e.g. a diagonal matrix), I've had no luck so far coming up with a non-Hermitian matrix that after eigen still has exactly equal eigenvalues.

@wesselb any ideas? Also, more broadly, are you able to review this?

We could use Mocking.jl for this, if we want. Or SimpleMock.jl (which could be a test-only dependency, IIRC).
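For concreteness, here is a rough sketch of what the Mocking.jl approach could look like. It assumes the `eigen` call inside the rule were annotated with `Mocking.@mock` (which it currently is not); all names below are illustrative, not an existing test:

```julia
using LinearAlgebra, Mocking

Mocking.activate()

# Patch eigen to report an exactly degenerate decomposition,
# regardless of the input matrix.
degenerate_eigen = @patch LinearAlgebra.eigen(A::AbstractMatrix) =
    Eigen([1.0, 1.0, 2.0], Matrix{Float64}(I, 3, 3))

apply(degenerate_eigen) do
    # Only call sites written as `@mock eigen(...)` dispatch to the patch.
    F = @mock eigen(randn(3, 3))
    @assert F.values[1] == F.values[2]
end
```

The downside is that the rule's own code would need the `@mock` annotation at the `eigen` call site, which is an intrusive change just to support testing.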

oxinabox avatar Mar 17 '21 12:03 oxinabox

While it's not too hard to contrive a Hermitian matrix with exactly repeating eigenvalues (e.g. a diagonal matrix), I've had no luck so far coming up with a non-Hermitian matrix that after eigen still has exactly equal eigenvalues.

A simple example is a lower-triangular matrix with equal entries on the diagonal:

julia> using LinearAlgebra

julia> A = randn(4, 4);

julia> A[diagind(A)] .= 1;

julia> tril!(A);

julia> A
4×4 Array{Float64,2}:
  1.0        0.0        0.0       0.0
 -0.40898    1.0        0.0       0.0
  1.36608   -0.168648   1.0       0.0
  0.288037   0.103924  -0.734558  1.0

julia> eigen(A)
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
4-element Array{Float64,1}:
 1.0
 1.0
 1.0
 1.0
vectors:
4×4 Array{Float64,2}:
 0.0  0.0          0.0          2.16078e-46
 0.0  0.0          3.97991e-31  3.97991e-31
 0.0  3.02283e-16  3.02283e-16  3.02283e-16
 1.0  1.0          1.0          1.0

devmotion avatar Mar 17 '21 13:03 devmotion

While it's not too hard to contrive a Hermitian matrix with exactly repeating eigenvalues (e.g. a diagonal matrix), I've had no luck so far coming up with a non-Hermitian matrix that after eigen still has exactly equal eigenvalues.

@wesselb any ideas? Also, more broadly, are you able to review this?

In addition to @devmotion's example, here is a non-triangular example:

julia> P = [0 0 0 1 0; 0 1 0 0 0; 1 0 0 0 0; 0 0 1 0 0; 0 0 0 0 1]
5×5 Matrix{Int64}:
 0  0  0  1  0
 0  1  0  0  0
 1  0  0  0  0
 0  0  1  0  0
 0  0  0  0  1

julia> eigvals(P)
5-element Vector{ComplexF64}:
               -0.5 - 0.8660254037844389im
               -0.5 + 0.8660254037844389im
 0.9999999999999998 + 0.0im
                1.0 + 0.0im
                1.0 + 0.0im

julia> eigvals(P)[4] == eigvals(P)[5]
true

I'm unfortunately not familiar enough with this derivative to know what's going on...

wesselb avatar Mar 18 '21 09:03 wesselb

While it's not too hard to contrive a Hermitian matrix with exactly repeating eigenvalues (e.g. a diagonal matrix), I've had no luck so far coming up with a non-Hermitian matrix that after eigen still has exactly equal eigenvalues.

It's a funny sentence, because technically Hermitian matrices with repeated eigenvalues are rarer (codimension 2) than non-Hermitian ones (codimension 1). One way to cook up examples with a prescribed spectrum is to draw random matrices (plain random for the non-Hermitian case, unitary for the Hermitian case) and set A = P D P^-1.
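A sketch of that construction, prescribing the spectrum via a similarity transform A = P D P^-1. Note that roundoff in the transform (and in `eigen` itself) means the recovered eigenvalues are in general only *approximately* repeated, so this produces near-degenerate rather than exactly degenerate test cases:

```julia
using LinearAlgebra

D = Diagonal([1.0, 1.0, 2.0, 3.0])  # spectrum with a repeated eigenvalue

P = randn(4, 4)                     # generically invertible
A = P * D / P                       # non-Hermitian, eigenvalues ≈ diag(D)

Q = Matrix(qr(randn(4, 4)).Q)       # orthogonal (unitary) matrix
H = Hermitian(Q * D * Q')           # Hermitian with the same prescribed spectrum
```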

There's a bunch of code duplication in this PR; maybe define a `divided_difference(fx, fy, x, y)` helper?
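One hypothetical shape for such a helper (the name and signature come from the suggestion above; it is not part of the PR): the standard divided difference (fx - fy) / (x - y), with the exactly degenerate case x == y mapped to zero, matching the PR's convention for exactly repeated eigenvalues.

```julia
# Divided difference with the degenerate case zeroed out, as the PR
# does for exactly equal eigenvalues (illustrative sketch only).
function divided_difference(fx, fy, x, y)
    return x == y ? zero((fx - fy) / one(x - y)) : (fx - fy) / (x - y)
end
```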

There's nothing wrong with this PR per se, but I think it will only work on test cases that are not subject to roundoff error (basically, diagonally perturbed diagonal matrices). In all other cases the eigenvalues will never be exactly equal (unless there is a particular symmetry that is preserved in floating-point arithmetic, which can definitely happen).

Fundamentally this is an unsolvable problem, because the perturbation of eigenvectors is undefined at degeneracies and ill-conditioned close to them. The trouble is that even though eigenvectors are ill-defined at degeneracies, one is often interested in properties of (well-separated) subspaces, which are well-defined. However, there's no way to tell the AD "yeah, I'm computing all these eigenvectors separately, but in reality I'm going to do stuff on the eigenspace spanned by the first and second eigenvalues" (except possibly in the reverse-mode case, where the "stuff" in question is known?), so the AD has to read the user's mind.

Maybe it should apply an rtol to the difference of eigenvalues, defaulting to n*eps, and in that case return zero no matter what the perturbation is? That would be consistent with how pinv works, and is probably what most users would want here. However, that's a pretty big thing to hide from the user...
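To make the rtol idea concrete, here is a sketch of such a cutoff (the function name and exact default are hypothetical, not part of the PR): zero out the 1/(λj - λi) factor whenever the gap is below a relative tolerance, defaulting to n*eps, in the spirit of pinv's default cutoff.

```julia
# Reciprocal eigenvalue gap with a relative-tolerance cutoff
# (illustrative sketch; not part of the PR).
function inv_gap(λi, λj, λ; rtol = length(λ) * eps(real(float(eltype(λ)))))
    Δ = λj - λi
    # Treat the pair as degenerate when the gap is small relative to
    # the largest eigenvalue magnitude, and return zero in that case.
    return abs(Δ) <= rtol * maximum(abs, λ) ? zero(inv(one(Δ))) : inv(Δ)
end
```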

antoine-levitt avatar Mar 21 '21 19:03 antoine-levitt