Enzyme.jl

inelegant solution to #1821

Open ExpandingMan opened this issue 1 year ago • 1 comments

This is a particularly inelegant solution to #1821.

For Symmetric and Hermitian, I take a redundant basis. This looks a bit strange because I set the individual elements of the underlying buffer and also set the wrapper type of the resulting matrix. Both are needed: if I do not set all the elements of the underlying buffer, Enzyme's autodiff returns the wrong result, and the wrapper type must match because BatchDuplicated currently only allows arguments and bases of the same type. I also go out of my way to call similar (rather than, e.g., zeros) to make it less likely that we run afoul of BatchDuplicated.
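As a rough illustration of the redundant basis described above, here is a minimal sketch that uses only LinearAlgebra. The helper name `redundant_basis` is hypothetical and is not taken from Enzyme or from this PR's diff; it only shows the idea of one tangent per index (i, j), with both mirrored buffer entries set and the wrapper type preserved.

```julia
using LinearAlgebra

# Hypothetical helper (not Enzyme API): one tangent matrix per index (i, j).
# The entries (i, j) and (j, i) describe the same degree of freedom, so the
# basis elements for (i, j) and (j, i) are identical, hence "redundant".
function redundant_basis(X::Symmetric{T}) where {T}
    n = size(X, 1)
    map(CartesianIndices((n, n))) do I
        i, j = Tuple(I)
        buf = similar(parent(X))        # same buffer type as the wrapped data
        fill!(buf, zero(T))
        buf[i, j] = one(T)              # set both mirrored entries of the buffer
        buf[j, i] = one(T)
        Symmetric(buf, Symbol(X.uplo))  # keep the wrapper type and the stored triangle
    end
end

X = Symmetric(Float64[1 2; 2 3])
B = redundant_basis(X)   # 2×2 array of Symmetric matrices
```

Here `B[1, 2]` and `B[2, 1]` both wrap the buffer `[0 1; 1 0]`, which is where the redundancy comes from.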

I think this works in general, but I have not checked it exhaustively. I'm not sure how Enzyme would be expected to deal with complex arguments here, so I've only tested purely real cases so far.

To see an example of how it works, consider symmetric matrix $X$ and $$f(X) = X_{11}^2 + 2 X_{12}^2 + 3 X_{21}^2 - X_{22}^2$$ For extra clarity, write $X = \left(\matrix{a & b \cr b & c}\right)$ so that $$f(X) = a^2 + 5 b^2 - c^2$$ The derivative with respect to the off-diagonal element $b$ is clearly $10b$.
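Spelling out the partial derivatives, with both off-diagonal entries tied to the single variable $b$:

$$\frac{\partial f}{\partial a} = 2a, \qquad \frac{\partial f}{\partial b} = 4b + 6b = 10b, \qquad \frac{\partial f}{\partial c} = -2c$$

For instance, at $a = 1$, $b = 2$, $c = 3$ these evaluate to $2$, $20$ and $-6$.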

I implement this in Julia via

using Enzyme, LinearAlgebra

f0(x) = x[1,1]^2 + 2*x[1,2]^2 + 3*x[2,1]^2 - x[2,2]^2

Then, for example

X = Hermitian(Float64[1 2; 2 3])

so that

julia> gradient(Forward, f0, X)[1]
2×2 Enzyme.TupleArray{Float64, (2, 2), 4, 2}:
  2.0  20.0
 20.0  -6.0
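As an Enzyme-free sanity check of these numbers, the sketch below takes central differences along symmetric directions with ones at (i, j) and (j, i). The helpers `direction` and `central` are hypothetical names introduced only for this illustration. Because f0 is quadratic, the central difference is exact here, so it should reproduce the matrix above.

```julia
using LinearAlgebra

f0(x) = x[1,1]^2 + 2*x[1,2]^2 + 3*x[2,1]^2 - x[2,2]^2

X = Hermitian(Float64[1 2; 2 3])

# Hypothetical helper: symmetric direction with ones at (i, j) and (j, i).
function direction(i, j, n)
    buf = zeros(n, n)
    buf[i, j] = 1.0
    buf[j, i] = 1.0
    Hermitian(buf)
end

# Central difference along B; exact for a quadratic such as f0.
central(f, X, B; h = 1.0) = (f(X + h * B) - f(X - h * B)) / (2h)

G = [central(f0, X, direction(i, j, 2)) for i in 1:2, j in 1:2]
# G == [2.0 20.0; 20.0 -6.0]
```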

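This also works if the underlying data for X is [1.0 2.0; 0.0 3.0], which represents the same matrix because Hermitian reads only the upper triangle of its buffer by default.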
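The following small sketch, using only LinearAlgebra, shows why the two buffers describe the same primal matrix: with the default upper-triangle storage, the lower-triangle entry of the buffer is never read.

```julia
using LinearAlgebra

A = [1.0 2.0; 0.0 3.0]   # lower-triangle entry deliberately left at zero
X = Hermitian(A)         # default storage: upper triangle

x21 = X[2, 1]            # mirrored from A[1, 2], so the 0.0 in A[2, 1] is ignored
same = X == Hermitian(Float64[1 2; 2 3])
```

Here `x21` is `2.0` and `same` is `true`, even though `parent(X)` still holds `0.0` at position (2, 1).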

This will remain a draft until I come up with more comprehensive tests for it.

ExpandingMan Oct 01 '24 23:10

Alright, this looks like it works. Again, while this may be a fix, the solution is not very nice and involves a lot of redundancy, but I think anything better would be significantly more complicated.

I would certainly advise giving this a good think before merging. Keep in mind that a really determined user can build this on their own using only the existing public API (gradient and shadows), so I would find it perfectly reasonable if you decided it was better to close this and leave it broken for the foreseeable future rather than add more behavior that it would be better to break later.

ExpandingMan Oct 02 '24 22:10