Best Practices for Custom Gradients

Open captain-pool opened this issue 2 years ago • 5 comments

I've been trying to define a custom egrad and ehess; however, I can't seem to find the right way to do it. Here are the things I tried:

  1. https://github.com/pymanopt/pymanopt/issues/38#issuecomment-319102003: I tried this approach, but it didn't work; I got an error here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/pymanopt/core/problem.py#L105-L109

  2. Upon decorating the function with pymanopt.function.PyTorch, I got an error here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/pymanopt/core/problem.py#L138-L140 saying the return type is "not expected", since it only accepts a list or a tuple.

  3. Hence, I tried returning a list of PyTorch tensors instead, which raised an error here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/pymanopt/autodiff/backends/_pytorch.py#L47 saying it cannot find a .numpy() attribute on a list, as one would expect.

  4. Next, I built a pymanopt.autodiff.Function with the _CallableBackend backend for both the custom_egrad and custom_ehess functions, returning a list of NumPy arrays from each. The code finally ran; however, neither the cost nor |grad| changed across any of the TrustRegions solver's steps. I ran multiple different initializations, wondering whether it was down to a bad initialization, but it stayed this way in all of these experiments.

  5. Finally, I monkey-patched the _egrad and _ehess attributes of pymanopt.Problem as follows:

problem = pymanopt.Problem(manifold=..., cost=...)
problem._egrad = custom_egrad
problem._ehess = custom_ehess

This time, I returned a list of NumPy arrays from custom_egrad and custom_ehess.

Result: as in 4., the cost and |grad| stayed exactly the same for all steps in all 10+ experiments with different initializations.

I reconfirmed that the optimization makes no progress by comparing the Frobenius norm of the difference between the initialization and the result: the change is negligible (~1e-5).

The problem itself is defined correctly, since it converges to the desired results if I don't use a custom egrad and ehess. However, I need the customization for parallelization purposes. As an extra step, I removed all parallelization code to see whether the values change with the custom gradient and Hessian. They don't.

So, clearly, I'm doing something wrong here. What's the correct way to define custom gradients and Hessian-vector products?

Thanks.

UPDATE: 6. Tried using the Callable backend as shown here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/examples/advanced/check_gradient.py#L26-L32 The code runs, but, much like in 4. and 5., the cost value and |grad| do not update between steps. (A sketch of this wiring follows below.)
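
For reference, attempts 4-6 amount to wiring along these lines. This is only a sketch assuming the pre-2.0 API used in this thread; custom_cost, custom_egrad and custom_ehess are hypothetical stand-ins for the plain-NumPy callables, not real library names.

import pymanopt

# custom_cost, custom_egrad and custom_ehess are hypothetical stand-ins
# for user-supplied plain-NumPy callables; pymanopt.function.Callable
# wraps them so that Problem accepts them without an autodiff backend.
cost = pymanopt.function.Callable(custom_cost)
egrad = pymanopt.function.Callable(custom_egrad)
ehess = pymanopt.function.Callable(custom_ehess)

problem = pymanopt.Problem(manifold=manifold, cost=cost, egrad=egrad, ehess=ehess)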

captain-pool avatar Mar 04 '22 22:03 captain-pool

CC: @j-towns @NicolasBoumal

captain-pool avatar Mar 04 '22 22:03 captain-pool

Thanks for the detailed description of your attempts so far. The following example shows one way of defining egrad directly: https://github.com/pymanopt/pymanopt/blob/master/examples/dominant_invariant_subspace.py If that doesn't help, could you post a minimal working example where this issue comes up, i.e. something we could run on our end?
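
Roughly, the pattern in that example looks like the following sketch, assuming the 0.2.x API used elsewhere in this thread (the matrix A and the dimensions are placeholders):

import numpy as np
import pymanopt
from pymanopt.manifolds import Grassmann
from pymanopt.solvers import TrustRegions

# Dominant invariant subspace: maximize trace(X^T A X) over the
# Grassmann manifold by minimizing its negative, with egrad supplied
# directly instead of via an autodiff backend.
n, p = 128, 3
rng = np.random.default_rng(42)
A = rng.normal(size=(n, n))
A = 0.5 * (A + A.T)
manifold = Grassmann(n, p)

@pymanopt.function.Callable
def cost(X):
    return -np.trace(X.T @ A @ X)

@pymanopt.function.Callable
def egrad(X):
    return -2 * A @ X

problem = pymanopt.Problem(manifold=manifold, cost=cost, egrad=egrad)
solver = TrustRegions()
Xopt = solver.solve(problem)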

NicolasBoumal avatar Mar 06 '22 13:03 NicolasBoumal

Hi @NicolasBoumal, after some more debugging, I found that setting use_rand = False for TrustRegions works with the custom differentials. Apparently, the cost was not updating because the solver was rejecting all the proposed steps. It is, however, very confusing why use_rand=True works with the default differentials but not with custom ones.
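
For completeness, the workaround amounts to the following, assuming use_rand is the TrustRegions constructor flag controlling random initialization of the inner truncated-CG iterations (as in Manopt's trustregions):

from pymanopt.solvers import TrustRegions

# With use_rand=False the inner tCG solve starts from the zero tangent
# vector instead of a random one; with the custom differentials above,
# this stopped the solver from rejecting every proposed step.
solver = TrustRegions(use_rand=False)
Xopt = solver.solve(problem)  # `problem` as constructed earlier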

captain-pool avatar Mar 06 '22 16:03 captain-pool

Indeed, that is something to look into. Thanks for reporting it!

NicolasBoumal avatar Mar 07 '22 06:03 NicolasBoumal

Sorry for only getting back to you on this now. Overwriting internal properties should really not be the way to provide custom gradient/Hessian maps. Decorating a callable with the pymanopt.function.numpy decorator is all that should be necessary. If you could provide a minimal working example, we could look into this a little further :pray:
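
For anyone landing here later, a minimal sketch of that recommended route, assuming the pymanopt 2.x API (pymanopt.function.numpy(manifold), euclidean_gradient/euclidean_hessian keyword arguments, and optimizers in place of solvers); the toy cost is a stand-in, not the problem from this issue:

import numpy as np
import pymanopt
from pymanopt.manifolds import Stiefel
from pymanopt.optimizers import TrustRegions

# Toy problem: minimize -trace(X^T A X) over the Stiefel manifold, with
# the Euclidean gradient and Hessian-vector product supplied by hand.
manifold = Stiefel(5, 2)
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
A = A + A.T

@pymanopt.function.numpy(manifold)
def cost(X):
    return -np.trace(X.T @ A @ X)

@pymanopt.function.numpy(manifold)
def euclidean_gradient(X):
    return -2 * A @ X

@pymanopt.function.numpy(manifold)
def euclidean_hessian(X, H):
    # Euclidean Hessian at X applied to the tangent direction H.
    return -2 * A @ H

problem = pymanopt.Problem(
    manifold,
    cost,
    euclidean_gradient=euclidean_gradient,
    euclidean_hessian=euclidean_hessian,
)
result = TrustRegions().run(problem)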

nkoep avatar May 28 '22 11:05 nkoep