
Can theseus return negative cost?

Open Jeff09 opened this issue 1 year ago • 13 comments

❓ Questions and Help

Hi theseus team,

I'm currently using theseus to solve a non-linear optimization problem. In my case, some cost functions can return a negative cost. However, the objective squares the cost by default, so minimizing pushes negative costs toward zero rather than letting them decrease further, and the gradient direction flips sign. Is there any method to handle this case?
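For illustration, here's a toy PyTorch snippet (not theseus code) showing the effect I mean: once a negative residual is squared, gradient descent drives it toward zero, so the raw cost actually increases.

```python
import torch

x = torch.tensor([1.0], requires_grad=True)
residual = -x                  # raw cost is -1 and could go to -inf
loss = (residual ** 2).sum()   # least-squares aggregation squares it
loss.backward()
print(x.grad)  # tensor([2.]): descent moves x toward 0,
               # so the raw cost -x rises from -1 toward 0
```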

Thank you so much.

Jeff09 avatar Apr 27 '23 00:04 Jeff09

Hi @Jeff09. Most of our optimization methods assume a nonlinear least-squares formulation, and won't work with a different cost aggregation metric. That being said, you can try our differentiable CEM solver, which supports other error metrics. You can change the default sum of squares by passing a different value here.

Let me know if this helps.
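For example, a minimal sketch of swapping in a different metric, using the error_metric_fn keyword and the sum-style metric that come up later in this thread (the exact signature may differ across versions):

```python
import torch
import theseus as th

# Hypothetical metric: aggregate by summing the raw error vector instead
# of the default sum of squares. Assumes the metric receives the batched
# error vector of shape (batch_size, error_dim) and returns one value
# per batch element.
def error_sum_fn(error_vector: torch.Tensor) -> torch.Tensor:
    return error_vector.sum(dim=1)

objective = th.Objective(error_metric_fn=error_sum_fn)
```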

luisenp avatar Apr 28 '23 13:04 luisenp

Hi @luisenp, thank you for your advice.

I have tried changing the default sum of squares to just the sum of the error vector. I can get a negative cost using the sum function. However, the optimizer does not seem to be working as it should: it is increasing the cost rather than decreasing it. By the way, I'm using the LevenbergMarquardt optimizer. Here's an example: this is the error printed at the 0th iteration. [screenshot of the iteration-0 error]

The error after the 1st iteration is the following. [screenshots of the iteration-1 error]

I have two questions regarding this.

  1. Why is the optimizer increasing the cost instead of lowering it?
  2. Isn't the printed error the sum of the error vector when the cost weight is set to 1?

Thank you so much.

Jeff09 avatar Apr 29 '23 18:04 Jeff09

As I mentioned above, these optimizers are meant for nonlinear sum of squares problems. The only optimizer in our library that can handle other types of metrics is DCEM.

BTW, you should have gotten this error when you tried to use LevenbergMarquardt. Did this not happen?

luisenp avatar Apr 29 '23 18:04 luisenp

No error happened when I tried LevenbergMarquardt.

I have tried DCEM, and it looks like no optimization happens. It only shows the 0th-iteration error; no further iteration errors are printed even though I have set verbose=True.

Here's the code where I tried DCEM.

```python
th.TheseusLayer(
    optimizer=th.DCEM(
        self.objective,
        linear_solver_cls=th.CholmodSparseSolver,
        vectorize=False,
        max_iterations=max_iter,
        step_size=step_size,
    ),
)
```

Jeff09 avatar Apr 29 '23 19:04 Jeff09

Are you on the latest version of Theseus? You are not getting any output at all? Also, note that the keywords for DCEM are different:

[screenshot of the DCEM constructor signature]
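Roughly along these lines, using the keywords that come up below in this thread (n_sample, n_elite, init_sigma); check the signature in the screenshot above, since the exact names and defaults may differ by version:

```python
layer = th.TheseusLayer(
    optimizer=th.DCEM(
        objective,
        max_iterations=50,
        n_sample=100,    # candidate solutions sampled per iteration
        n_elite=10,      # top candidates kept to refit the sampling distribution
        init_sigma=1.0,  # initial std of the sampling distribution
    )
)
```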

Can you share a short example that reproduces this behavior?

luisenp avatar May 01 '23 12:05 luisenp

@luisenp Here's the output using DCEM. [screenshot of DCEM output]

It only runs 2 optimization steps when I change the error_metric_fn from the default metric to error_sum_fn in th.Objective.

Jeff09 avatar May 02 '23 23:05 Jeff09

Ah, there was a bug in DCEM affecting negative costs. Fixed in #510. Can you check whether it works for your use case with this fix? Thanks for reporting this!

luisenp avatar May 03 '23 09:05 luisenp

Hi @luisenp, thanks for the quick fix. It works now and shows some optimization progress after 50 iterations. However, it doesn't look easy to find a good solution; I have tried different hyperparameters. Here's a sample of the error. [screenshot of sample error values]

It looks like the error variance is too large to converge. Could you give me some advice on how to use DCEM?

Jeff09 avatar May 04 '23 00:05 Jeff09

Are your cost functions bounded below?

luisenp avatar May 04 '23 01:05 luisenp

Perhaps @dishank-b @bamos can also provide some tips for using DCEM.

mhmukadam avatar May 04 '23 14:05 mhmukadam

@Jeff09 the error seems to still be going down; can you try a higher max_iterations? You can try a higher n_elite as well. Also make sure init_sigma is big enough that the required solution is within ±2*sigma of the initial values of the variables.
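For example, something like the following (these happen to be the values tried in the next comment; treat them as a starting point for tuning, not as recommended defaults):

```python
optimizer = th.DCEM(
    objective,
    max_iterations=100,  # more iterations than before
    n_elite=10,          # larger elite set
    init_sigma=5.0,      # wide enough that the solution lies within ~2*sigma
)
```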

dishank-b avatar May 04 '23 14:05 dishank-b

Thank you for your tips on using DCEM.

I have increased max_iterations from 50 to 100 and am also using n_elite=10, init_sigma=5.0. However, the error does not look like it is moving in the right direction. In the first few iterations the error increases a lot and then decreases. In the end, there is not much optimization. When going to the next point, the 0th iteration has far too much error.
[screenshot of error per iteration]

Jeff09 avatar May 05 '23 18:05 Jeff09

@Jeff09 The error seems to be converging, even if slowly. DCEM is a randomized method, so it's not surprising that the error can increase between iterations; but by the 100th iteration your image shows it's definitely much lower than the initial error. You should play with the hyperparameters a bit to see what works best for your application. Lowering n_elite might help converge faster, at the cost of potentially worse solution quality (@dishank-b is this correct?). If you are not concerned about run time, maybe you should also increase n_sample, especially if your problem has a lot of optimization variables.

Now, regarding your comment about going to the next point: the expected behavior between different calls to the optimizer is application specific. Some questions I would consider:

  • Are the parameters of the objective the same as before? If not, what changes?
  • Are the solutions for two consecutive points expected to be related in some way?
  • Are you passing new initial values for the optimization variables?
  • Are you using the previous solution to initialize the optimization variables in the next call? (see the sketch below)
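On that last point, a minimal sketch of warm-starting between calls (the variable name "x" and the tensors here are hypothetical):

```python
# Solve once, then feed the solution back as the initial value
# for the next call to the layer.
values, info = layer.forward({"x": x_init})
x_warm = values["x"].detach()                  # previous solution
values, info = layer.forward({"x": x_warm})    # warm start for the next point
```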

Overall, countless parameters can affect what happens between different calls to the optimizer, and unfortunately there is not much we can advise without knowing more details about your application.

luisenp avatar May 05 '23 19:05 luisenp