Manifolds.jl icon indicating copy to clipboard operation
Manifolds.jl copied to clipboard

Add submersion metric for Stiefel manifold

Open sethaxen opened this issue 3 years ago • 21 comments

resolves #410, resolves #463

sethaxen avatar Sep 24 '22 13:09 sethaxen

Codecov Report

Merging #535 (bd42f04) into master (e3af13e) will decrease coverage by 0.01%. The diff coverage is 98.01%.

@@            Coverage Diff             @@
##           master     #535      +/-   ##
==========================================
- Coverage   98.95%   98.94%   -0.02%     
==========================================
  Files          99      100       +1     
  Lines        9495     9641     +146     
==========================================
+ Hits         9396     9539     +143     
- Misses         99      102       +3     
Impacted Files Coverage Δ
src/Manifolds.jl 100.00% <ø> (ø)
src/manifolds/StiefelSubmersionMetric.jl 97.81% <97.81%> (ø)
src/manifolds/StiefelCanonicalMetric.jl 98.00% <100.00%> (-0.04%) :arrow_down:
src/manifolds/StiefelEuclideanMetric.jl 97.95% <100.00%> (+0.39%) :arrow_up:
src/manifolds/Symmetric.jl 100.00% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Sep 24 '22 14:09 codecov[bot]

There are a few more edge cases to handle, and docstrings and tests are still needed, but the PR should otherwise be function-complete.

sethaxen avatar Sep 24 '22 15:09 sethaxen

The implementation of the generic shooting method is now quite general. It differs in 2 ways from the one in the algorithm in the paper:

  1. the initial guess is not re-scaled after projection to have the same length as the gap vector. This is because we have a ProjectionInverseRetraction that just computes the projection without rescaling. This is probably not so important.
  2. norm(X) has been replaced with norm(M, p, X). For StiefelSubmersionMetric this will be more expensive, but it probably makes the method more generalizable, since norm(X) will not necessarily tell us anything about the magnitude of the vector.

I think the one shooting case the method doesn't cover is the one that minimizes the gradient. This should probably be its own inverse retraction type. Some demos:

julia> inverse_retraction = ShootingInverseRetraction(ExponentialRetraction(), ProjectionInverseRetraction(), ProjectionTransport(), 4, sqrt(eps()), 1_000);

julia> M = Sphere(5, ℂ);

julia> p = project(M, randn(ComplexF64, representation_size(M)));

julia> X = project(M, p, randn(representation_size(M)));

julia> X .*= inv(norm(M, p, X));

julia> isapprox(M, p, inverse_retract(M, p, exp(M, p, X), inverse_retraction), X)
true

sethaxen avatar Sep 25 '22 09:09 sethaxen

Cool! For the first point we could introduce a ScaledProjectionRetraction (which is a retraction as long as in the limit for smaller and smaller vectors the scaling tends to 1) and where one can change/adapt scaling.

I think the second change is, though more expensive, the better choice (in general).

kellertuer avatar Sep 25 '22 09:09 kellertuer

Cool! For the first point we could introduce a ScaledProjectionRetraction (which is a retraction as long as in the limit for smaller and smaller vectors the scaling tends to 1) and where one can change/adapt scaling.

I think this should be handled in its own PR, since it seems there would be a number of design decisions, and it doesn't seem to be essential for this PR (shooting works without the scaling).

sethaxen avatar Sep 25 '22 16:09 sethaxen

Yes you‘re right, might be too much for this PR.

kellertuer avatar Sep 25 '22 16:09 kellertuer

Should shooting and ShootingInverseRetraction be immediately upstreamed to ManifoldsBase?

sethaxen avatar Sep 25 '22 19:09 sethaxen

To me it looks generic enough that it could, @mateuszbaran what do you think?

kellertuer avatar Sep 25 '22 19:09 kellertuer

We should probably also use this to implement log for EuclideanMetric.

sethaxen avatar Sep 25 '22 20:09 sethaxen

To me it looks generic enough that it could, @mateuszbaran what do you think?

I will take a look at it today.

mateuszbaran avatar Sep 26 '22 07:09 mateuszbaran

To me it looks generic enough that it could, @mateuszbaran what do you think?

I will take a look at it today.

I agree, the shooting method is generic enough that it can be in ManifoldsBase.jl.

mateuszbaran avatar Sep 26 '22 08:09 mateuszbaran

I agree, the shooting method is generic enough that it can be in ManifoldsBase.jl.

Okay! I'll split it out into a PR directly to ManifoldsBase.

sethaxen avatar Sep 26 '22 08:09 sethaxen

I opted to create a factorization type and implement only the necessary operations on this type. This is better documented, and it also allows us to still use generic algorithms. e.g. we now use the generic shooting algorithm from ManifoldsBase, but for k < div(n, 2) we first compute the factorization and then run shooting with the factorization.

The factorization object is somewhat fragile (e.g. its broadcast implementation), and all implementations assume the same base point (U) is used (which we guarantee in our code but cannot guarantee in user code), so for now this is an internal implementation detail.

I'll add unit tests for the factorization object in the coming days.

sethaxen avatar Oct 04 '22 22:10 sethaxen

Cool! Haven‘t checked the rendered docs. but the rest looks good so far. Sure as long as the new type is still a little fragile, lets keep it internal.

kellertuer avatar Oct 05 '22 05:10 kellertuer

By the way I think this also fixes #463.

kellertuer avatar Oct 05 '22 15:10 kellertuer

I just noticed that the retraction is only implemented for this new metric. Since retractions are defined just using the differential, they are actually retraction independent of which metric you have in mind. Could we adapt that and also put the retraction into the first section of the documentation page (common and metric independent functions)?

I also saw the warning and maybe that early in the doctoring is a little too early and we could only mention that near the end?

kellertuer avatar Oct 10 '22 05:10 kellertuer

I just noticed that the retraction is only implemented for this new metric. Since retractions are defined just using the differential, they are actually retraction independent of which metric you have in mind. Could we adapt that and also put the retraction into the first section of the documentation page (common and metric independent functions)?

Which retraction do you mean? The only retraction this PR defines is exp!, which is dependent on this metric.

I also saw the warning and maybe that early in the doctoring is a little too early and we could only mention that near the end?

Which warning? What do you mean by "doctoring"?

sethaxen avatar Oct 10 '22 05:10 sethaxen

I am speaking about this

https://juliamanifolds.github.io/Manifolds.jl/previews/PR535/manifolds/stiefel.html#ManifoldsBase.inverse_retract-Tuple{MetricManifold{ℝ,%20var%22#s113%22,%20var%22#s112%22}%20where%20{var%22#s113%22%3C:Stiefel,%20var%22#s112%22%3C:StiefelSubmersionMetric},%20Any,%20Any,%20ProjectionInverseRetraction}

retraction (or its inverse) and this

https://juliamanifolds.github.io/Manifolds.jl/previews/PR535/manifolds/stiefel.html#ManifoldsBase.inverse_retract-Tuple{MetricManifold{ℝ,%20var%22#s113%22,%20var%22#s112%22}%20where%20{var%22#s113%22%3C:Stiefel,%20var%22#s112%22%3C:StiefelSubmersionMetric},%20Any,%20Any,%20ShootingInverseRetraction}

Both could be inverse retractions just for Stiefel

For the “doctoring” - sorry that was autocorrect (and before first coffee) and should have been “docstring” I mean

https://juliamanifolds.github.io/Manifolds.jl/previews/PR535/manifolds/stiefel.html#Manifolds.StiefelFactorization

the one in here. We could move that down

kellertuer avatar Oct 10 '22 08:10 kellertuer

Alternatively we could also do a Internal Data Structure Section in the very end of the Stiefel page where we put the doc string?

kellertuer avatar Oct 10 '22 09:10 kellertuer

I am speaking about this

https://juliamanifolds.github.io/Manifolds.jl/previews/PR535/manifolds/stiefel.html#ManifoldsBase.inverse_retract-Tuple{MetricManifold{ℝ,%20var%22#s113%22,%20var%22#s112%22}%20where%20{var%22#s113%22%3C:Stiefel,%20var%22#s112%22%3C:StiefelSubmersionMetric},%20Any,%20Any,%20ProjectionInverseRetraction}

retraction (or its inverse) and this

https://juliamanifolds.github.io/Manifolds.jl/previews/PR535/manifolds/stiefel.html#ManifoldsBase.inverse_retract-Tuple{MetricManifold{ℝ,%20var%22#s113%22,%20var%22#s112%22}%20where%20{var%22#s113%22%3C:Stiefel,%20var%22#s112%22%3C:StiefelSubmersionMetric},%20Any,%20Any,%20ShootingInverseRetraction}

Both could be inverse retractions just for Stiefel

Yes, certainly, these can be generalized. Although, depending on how the forwarding works, I may still also need to implement them for this metric.

Alternatively we could also do a Internal Data Structure Section in the very end of the Stiefel page where we put the doc string?

Yes, I think I prefer this approach, since right now it is even listed before the metric itself.

sethaxen avatar Oct 10 '22 10:10 sethaxen

Yes, certainly, these can be generalized. Although, depending on how the forwarding works, I may still also need to implement them for this metric.

usually you would implement something like retract_project then it works fine. For the second

https://github.com/JuliaManifolds/ManifoldsBase.jl/blob/master/src/shooting.jl#L56

works fine, then the traits system is properly taken into account (no need to reimplement it for the specific metric)

Alternatively we could also do a Internal Data Structure Section in the very end of the Stiefel page where we put the doc string?

Yes, I think I prefer this approach, since right now it is even listed before the metric itself.

Perfect that would also be my preferred way here

kellertuer avatar Oct 10 '22 11:10 kellertuer