Best Practices for Custom Gradients
I've been trying to define a custom `egrad` and `ehess`; however, I can't seem to find the right way to do it. Here are the things I tried:
1. Tried the approach from https://github.com/pymanopt/pymanopt/issues/38#issuecomment-319102003. It didn't work; I got an error here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/pymanopt/core/problem.py#L105-L109
2. Upon decorating the function with `pymanopt.function.PyTorch`, I got an error here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/pymanopt/core/problem.py#L138-L140 saying the return type is "not expected", as only a `list` or a `tuple` is accepted. (See the sketch after this list for what this decoration pattern looks like.)
3. Hence, I tried returning a list of PyTorch tensors instead, which raised an error here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/pymanopt/autodiff/backends/_pytorch.py#L47 saying a list has no `.numpy()` attribute, as one would expect.
4. Next, I made a `pymanopt.autodiff.Function` with the backend as `_CallableBackend` for both the `custom_egrad` and `custom_ehess` functions, and returned a list of numpy arrays from each. The code finally ran; however, neither the `cost` nor the `|grad|` changed with the `TrustRegions` solver at any step. I ran multiple different initializations, wondering whether it was just a bad initialization, but it stayed this way in every experiment.
5. Finally, I monkey-patched the `_egrad` and `_ehess` attributes of `pymanopt.Problem` as follows:

   ```python
   problem = pymanopt.Problem(manifold=..., cost=...)
   # Bypass the autodiff machinery by patching the internal attributes.
   problem._egrad = custom_egrad
   problem._ehess = custom_ehess
   ```

   This time, too, I returned a list of numpy arrays from `custom_egrad` and `custom_ehess`.
   Result: like in 4., the `cost` and the `|grad|` stayed exactly the same at every step in all 10+ experiments with different initializations.
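For reference, the decoration in attempt 2 followed this pattern (a minimal sketch assuming the 1.x-era decorator API from the commit linked above; the quadratic cost is a stand-in for my actual function):

```python
import torch

import pymanopt

# Decorating a callable that consumes torch tensors, as in attempt 2.
# The cost below is a placeholder, not the function from my experiments.
@pymanopt.function.PyTorch
def cost(x):
    return torch.sum(x ** 2)
```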
I reconfirmed that the iterates are not updating by comparing the Frobenius norm of the difference between the initialization and the result: the change is negligible (~1e-5).
The problem itself is defined correctly, since it converges to the desired results if I don't use a custom `egrad` and `ehess`. However, I need the customization for parallelization purposes. As an extra step, I removed all parallelization code to see whether the values change with the custom gradient and Hessian. They don't.
So, clearly, I'm doing something wrong here. What's the correct way to define custom gradients and Hessian-vector products?
Thanks.
UPDATE:
6. Tried using the `Callable` backend as shown here: https://github.com/pymanopt/pymanopt/blob/df94ab9e03b5fa3041668defe995d93b8715a6d7/examples/advanced/check_gradient.py#L26-L32. The code runs but, much like 4. and 5., the cost value and `|grad|` do not update between steps.
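A minimal sketch of what attempt 6 boils down to, assuming the `Callable` decorator and the `egrad`/`ehess` keyword arguments of `Problem` from the linked commit; the Rayleigh-quotient cost and the `Stiefel` manifold are stand-ins for my actual setup:

```python
import numpy as np

import pymanopt
from pymanopt.manifolds import Stiefel
from pymanopt.solvers import TrustRegions

manifold = Stiefel(5, 2)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A + A.T  # symmetrize so the cost is a well-behaved Rayleigh quotient

@pymanopt.function.Callable
def cost(X):
    return -np.trace(X.T @ A @ X)

@pymanopt.function.Callable
def egrad(X):
    # Euclidean gradient of the cost.
    return -2 * A @ X

@pymanopt.function.Callable
def ehess(X, U):
    # Euclidean Hessian-vector product of the cost along U.
    return -2 * A @ U

problem = pymanopt.Problem(manifold, cost, egrad=egrad, ehess=ehess)
solver = TrustRegions()
Xopt = solver.solve(problem)
```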
CC: @j-towns @NicolasBoumal
Thanks for the detailed description of your attempts so far. The following example shows one way of defining egrad directly: https://github.com/pymanopt/pymanopt/blob/master/examples/dominant_invariant_subspace.py If that does not help, could you post minimal working code where this issue comes up, something we could run on our end?
Hi @NicolasBoumal, after some more debugging, I found that setting `use_rand = False` for `TrustRegions` works with the custom differentials. Apparently, it was still not updating because the solver was rejecting all the proposed steps. It is, however, very confusing why `use_rand=True` works with the default differentials but not with the custom ones.
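For anyone else hitting this, the workaround looks like the following (a sketch assuming the solver API from the linked commit, where `TrustRegions` accepts a `use_rand` flag, and a `problem` already carrying the custom differentials):

```python
from pymanopt.solvers import TrustRegions

# With use_rand=True the truncated-CG subproblem is seeded randomly and,
# in my runs, every proposed step was rejected; disabling it made the
# solver accept steps computed from the custom egrad/ehess.
solver = TrustRegions(use_rand=False)
Xopt = solver.solve(problem)
```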
Indeed, that is something to look into. Thanks for reporting it!
Sorry for getting back to you on this just now. Overwriting internal properties should really not be the way to provide custom gradient/Hessian maps. Decorating a callable with the `pymanopt.function.numpy` decorator is all that should be necessary. If you could provide a minimal working example, we could look into this a little further :pray:
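To make that recommendation concrete, here is a minimal sketch assuming the current 2.x-style API, where the decorator takes the manifold and `Problem` accepts `euclidean_gradient`/`euclidean_hessian`; the Rayleigh-quotient cost is a placeholder:

```python
import numpy as np

import pymanopt
from pymanopt.manifolds import Sphere
from pymanopt.optimizers import TrustRegions

manifold = Sphere(5)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A + A.T  # symmetric matrix for a Rayleigh-quotient cost

@pymanopt.function.numpy(manifold)
def cost(x):
    return -x @ A @ x

@pymanopt.function.numpy(manifold)
def euclidean_gradient(x):
    return -2 * A @ x

@pymanopt.function.numpy(manifold)
def euclidean_hessian(x, u):
    # Euclidean Hessian-vector product along u.
    return -2 * A @ u

problem = pymanopt.Problem(
    manifold,
    cost,
    euclidean_gradient=euclidean_gradient,
    euclidean_hessian=euclidean_hessian,
)
result = TrustRegions().run(problem)
print(result.point, result.cost)
```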