theseus icon indicating copy to clipboard operation
theseus copied to clipboard

Dogleg nonlinear optimizer

Open luisenp opened this issue 4 years ago • 17 comments

luisenp avatar Dec 07 '21 16:12 luisenp

@luisenp Is this issue still open ?

jeffin07 avatar May 05 '22 14:05 jeffin07

Hi @jeffin07, yes this is open!

luisenp avatar May 05 '22 16:05 luisenp

@luisenp Can you please guide me. I would like to work on it

jeffin07 avatar May 05 '22 18:05 jeffin07

Awesome! The idea would be to implement this optimization method.

As a guide, you can use the code for Gauss-Newton. This function computes \delta_{gn} in the link above. Note that this delta is batched, so the shape would be something like batch_size x total_error_dim.

To compute delta_{sd}, you need access to the linearization A and b matrices (J and f(x), in the wiki notation,respectively). This is easy to do when using the dense solvers, since you will have access to self.linear_solver.linearization.Atb. However, when using sparse solvers we might need to add some other functionality to access this (cc @maurimo). But we can start with the dense solver in the mean time.

Note that all of this has to be differentiable, so you'll probably have to use torch.where carefully when computing the final step size (which might be different for each element in the batch).

Let me know if you have follow up questions.

luisenp avatar May 06 '22 16:05 luisenp

So my understanding is only about accessing the linearization from the different solver classes. I think this should be quite straightforward, if there is no unified way to do so then this is a good opportunity to enforce that there should indeed be one! The sparse linearization will provide different representations for A,b, but it's indeed an alternative case that can be handled separately.

maurimo avatar May 06 '22 17:05 maurimo

Hi @jeffin07, following up on the status of this issue. Let us know if you have more questions.

mhmukadam avatar May 31 '22 19:05 mhmukadam

@mhmukadam Sorry i was not able to work on this,was a little busy. I will let u know if i encounter any questions

jeffin07 avatar Jun 04 '22 05:06 jeffin07

@luisenp @mhmukadam Guass_Netwon in the compute_delta function it just call the self.linear_solver.solve(). I can't find the exact computations for the \delta_{gn} so i'm a little confused how to proceed. Also for the optimizer i need to calculate delta_{sd}. So i have to do delta_{sd}] in the compute_delta function right ? Sorry if i'm asking too many questions

jeffin07 avatar Jun 06 '22 17:06 jeffin07

Hi @jeffin07, that's correct, delta_gn is just delta_gn = self.linear_solver.solve(). For the easiest example of how the linear solvers do this computation, you can check here, and the _solve_system method of its subclasses (in the same file).

That's correct, in the dodleg optimizer the return value would be delta_sd.

luisenp avatar Jun 06 '22 20:06 luisenp

@luisenp Will the compute_delta function for dogleg optimizer returns self.linear_solver.linearization.Ata * self.linear_solver.linearization.Atb , Something like this ?

jeffin07 avatar Jun 20 '22 15:06 jeffin07

Hi @jeffin07, I'm really sorry, somehow I missed this comment until today. I also noticed my previous comment was not entirely accurate, the return value should follow this:

image

luisenp avatar Jun 30 '22 20:06 luisenp

Hi @jeffin07. I was wondering if you were blocked on this and if there was anything we could do to help. In case it's useful, the dogleg implementation of GTSAM is a good reference to look at (in this folder).

luisenp avatar Aug 03 '22 15:08 luisenp

@luisenp Sorry for the late reply.I missed the comments. Thank you for the reference. I will take a look and update you on this ASAP. Again Sorry for the late reply

jeffin07 avatar Aug 03 '22 16:08 jeffin07

@luisenp we need to calculate the delta_sd=-J'F(x), here J and F(x) which can be accessed using self.linear_solver.linearization.Atb. When looking at this link link our compute_delta function will need 3 inputs .What confuses me is that in GaussNewton we're just calling self.linear_solver.solve() to calculate delta_gn and no inputs are passed although it uses J and F(x) matrices. Can you help me to understand this ?

jeffin07 avatar Aug 09 '22 17:08 jeffin07

In their code, the input delta refers to the trust region radius, which we can probably keep as a member variable (of type tensor, for handling batches)*. The other two quantities can be directly computed from the linearization, so, as far as I can tell, we don't need to have any inputs. The sd direction we can get as you suggested (what they label dx_u), while the gn direction we can get with self.linear_solver.solve() (what they label dx_n).

*We might need an optimizer.reset() method if we do this, to make sure we don't keep trust regions between different optimizer calls.

luisenp avatar Aug 11 '22 13:08 luisenp

Hi @jeffin07, any updates/timeline on this? We are targeting 0.1.1 release in the next week or two.

mhmukadam avatar Sep 08 '22 15:09 mhmukadam

@mhmukadam Sorry I was really caught up at work. May be you can assign others

jeffin07 avatar Sep 08 '22 16:09 jeffin07