levenberg-marquardt

[v1.1] Add support for errors-in-variables

Open cshaa opened this issue 6 years ago • 20 comments

This pull request resolves #17 by adding support for regression with known standard deviations of the data. These can be passed via the optional xError and yError properties on the data object handed to the algorithm. Furthermore, I implemented adaptive fitting by adjusting the lambda parameter. I tried hard to maintain the good taste of the code, but I had to disable some overly restrictive eslint rules.

I've done these changes:

  1. I've changed minValue and maxValue to minValues and maxValues respectively. This way they're more consistent with the initialValues option. I've exposed these options in jsDoc, which fixes #15.
  2. Then I changed the property parameterError on the returned object to residuals, which is a much better name, as I describe in #18. However, that's a breaking change, so it needs to be taken care of.
  3. I changed the behavior of errorTolerance which fixes #18.
  4. I've implemented the adaptive changes of damping parameter. The new options are:
  • dampingDrop – if the fit got better, multiply damping by this factor, default value: 0.1
  • dampingBoost – if the fit got worse, multiply damping by this factor, default value: 1.5
  • minDamping and maxDamping – limits of the damping parameter
  5. I've added support for EIV (errors in variables). They can be passed as data.xError and data.yError to the algorithm. The new options are:
  • errorPropagation – how many iterations are used to approximate the error propagation through the parametrized function, default value: 50
    • Previously there were rough and fine options to distinguish between rough guesses far from the minimum and more precise estimations near the minimum. But because I wasn't able to avoid getting stuck in local minima due to a suboptimal "switching strategy", I decided to drop this distinction.
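As a sketch, the damping update described in item 4 amounts to something like the following (a minimal illustration with hypothetical names such as `ssr` and `adaptDamping`; this is not the code in this PR, only the idea behind the options):

```javascript
// Sketch of the adaptive damping strategy, not the library's actual code.
// `ssr(params)` is a hypothetical stand-in for the sum of squared residuals.
function adaptDamping(ssr, params, nextParams, damping, options) {
  const {
    dampingDrop = 0.1,  // fit improved: relax towards Gauss-Newton
    dampingBoost = 1.5, // fit got worse: move towards gradient descent
    minDamping = 1e-9,
    maxDamping = 1e9,
  } = options;

  const clamp = (x) => Math.min(maxDamping, Math.max(minDamping, x));

  if (ssr(nextParams) < ssr(params)) {
    // Accept the step and decrease the damping.
    return { params: nextParams, damping: clamp(damping * dampingDrop) };
  }
  // Reject the step and increase the damping.
  return { params, damping: clamp(damping * dampingBoost) };
}
```

Small damping makes the step behave like Gauss–Newton, large damping like gradient descent, which is why improvement drops it and deterioration boosts it.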

Both the old and my improved version of this algorithm are quite prone to local minima. My thought was that a rough scan of the parameter space (computing the residuals for various parameter combinations) before the actual fitting could improve the results. But I'd better find a paper on this beforehand.
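A naive version of that rough pre-scan could look like this (purely illustrative; `coarseScan`, the data layout, and the grids are assumptions, not part of this PR):

```javascript
// Naive coarse grid search to pick a starting point for the LM iteration.
// Only an illustration of the pre-scan idea, not an implementation in this PR.
function coarseScan(data, paramsToFn, grids) {
  const ssr = (params) => {
    const f = paramsToFn(params);
    let sum = 0;
    for (let i = 0; i < data.x.length; i++) {
      const d = data.y[i] - f(data.x[i]);
      sum += d * d;
    }
    return sum;
  };

  // Cartesian product of the per-parameter grids.
  let candidates = [[]];
  for (const grid of grids) {
    candidates = candidates.flatMap((c) => grid.map((v) => [...c, v]));
  }

  // Keep the candidate with the smallest sum of squared residuals.
  let best = candidates[0];
  for (const c of candidates) {
    if (ssr(c) < ssr(best)) best = c;
  }
  return best; // use as initialValues for the actual fit
}
```

The cost is exponential in the number of parameters, so this only pays off for coarse grids and few parameters.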

[image]

A few things that need to be done before merging:

  • [ ] Add more tests for the new code
  • [x] Add a "legacy layer" which would avert the breaking changes
  • [x] Fine-tune the damping parameter
  • [x] Find a new name for errorTolerance
  • [x] Compare the weighted and non-weighted versions of this algorithm
  • [x] ~~Find a way to smoothly transition between errorPropagation.rough and errorPropagation.fine~~ (I decided to drop this API)
  • [x] ~~Find a way to avoid local minima~~ (Probably not in the scope of this PR)

cshaa avatar Sep 08 '18 11:09 cshaa

Codecov Report

Merging #19 into master will decrease coverage by 6.83%. The diff coverage is 86.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #19      +/-   ##
==========================================
- Coverage   98.63%   91.79%   -6.84%     
==========================================
  Files           3        3              
  Lines          73      134      +61     
  Branches       20       29       +9     
==========================================
+ Hits           72      123      +51     
- Misses          1       10       +9     
- Partials        0        1       +1
Impacted Files            Coverage          Δ
src/index.js              98% <100%>        (+0.63%) ↑
src/step.js               100% <100%>       (ø)
src/errorCalculation.js   81.48% <79.59%>   (-18.52%) ↓

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update afbbd90...a3d2b2d.

codecov[bot] avatar Sep 08 '18 11:09 codecov[bot]

Hello @m93a, many thanks for your interest in this package. Before anything else, do you have some literature where I could check the theoretical background of your suggestions?

jobo322 avatar Sep 08 '18 21:09 jobo322

@jobo322 Hey, my primary sources were these:

  • Numerical Recipes in FORTRAN 77, Chapter 15
  • Wikipedia: Levenberg-Marquardt
  • Damping–undamping strategies for the Levenberg–Marquardt nonlinear least-squares method
  • Wikipedia: Nonlinear regression

Plus some basic knowledge of statistics from university and a lot of googling.

If you have doubts about any specific claim, I'll try to provide more specific sources 🙂

cshaa avatar Sep 09 '18 02:09 cshaa

Hello @m93a, I will check the code as soon as possible; I'll also review the test cases. Again, many thanks for your collaboration.

jobo322 avatar Sep 09 '18 15:09 jobo322

Now the only thing left is to add new tests and evaluate why the old ones fail.

cshaa avatar Sep 24 '18 20:09 cshaa

Hello @m93a, I'm looking for an explanation of why the parameters in the sinFunction test case always go down. I have been trying different dampingBoost, dampingDrop, and damping values, but the result hits the minimum bounds defined by minValues unless you set initial values close to the real parameters. Do you have an idea how to control it?

jobo322 avatar Oct 03 '18 14:10 jobo322

Hey @jobo322, sadly I don't know yet. The first thing to do would be to make a 3D plot of the function sumOfSquaredResiduals(a, b) and explore the parameter space of this particular dataset. It is quite possible that the local minima are very shallow and get lower with higher frequency. In such a case, the algorithm would be performing as expected, and an additional technique would be needed to find the fundamental frequency. (Our desired solution is the local minimum which, of all the periodically recurring local minima, has the lowest frequency.) If you don't manage to inspect it and reach a conclusion, I'll take a look at it as soon as I have some free time (which should be within a month at most).

EDIT: It would be really helpful to make a plot with the algorithm's “trajectory” in the parameter space, rendered on top of a heatmap of that space.
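Generating the data for such a heatmap takes only a few lines; here is a sketch, assuming the model a·sin(b·x) and a dataset sampled from 2·sin(3x) (both assumptions based on the test case, not the actual test code):

```javascript
// Sample sumOfSquaredResiduals(a, b) on a grid, for a dataset generated by
// 2 * sin(3x). Model and sampling are assumptions, not the test-case code.
function ssrGrid(aValues, bValues) {
  const xs = Array.from({ length: 100 }, (_, i) => (i / 100) * 2 * Math.PI);
  const ys = xs.map((x) => 2 * Math.sin(3 * x));

  const ssr = (a, b) =>
    xs.reduce((sum, x, i) => sum + (ys[i] - a * Math.sin(b * x)) ** 2, 0);

  // Rows correspond to aValues, columns to bValues; feed this to a heatmap,
  // then overlay the algorithm's (a, b) trajectory on top of it.
  return aValues.map((a) => bValues.map((b) => ssr(a, b)));
}
```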

cshaa avatar Oct 04 '18 14:10 cshaa

Hello @m93a and thanks a lot for your work!

I hope you don't mind, I pushed a change to your branch to fix the conflicts with master and outstanding issues with eslint. In case you haven't seen it, we have an eslint-fix npm script to automatically fix styling issues so there shouldn't be a need to disable rules.

targos avatar Oct 18 '18 09:10 targos

I don't think we need to support legacy options. Let's design a good API and get rid of the inconsistencies.

targos avatar Oct 18 '18 09:10 targos

FYI I am taking a stab at this also. https://github.com/jacobq/levenberg-marquardt/tree/support-error

jacobq avatar Oct 19 '18 16:10 jacobq

@jobo322 @m93a Regarding the question about divergence in the Math.sin test, this family of functions seems to have a deep canyon near the solution but a broad plateau for low amplitude parameters. (I am using the terms from the paper "Why are Nonlinear Fits to Data so Challenging?" -- not sure if they are standard.) In other words, when the amplitude parameter is small, it is very hard to find the optimal frequency parameter, because going the "wrong way" may make the solution "better".

Here's a 2D plot showing how SSR varies with the frequency parameter when the amplitude is held correct: [image]

Here's a surface plot you can explore on jsbin.com.

(In these examples I used amplitude = 2 and frequency = 3)

jacobq avatar Oct 22 '18 20:10 jacobq

You can also see how holding the frequency constant could lead toward the sub-optimal solution of zero amplitude: [image]
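This plateau argument can also be checked numerically; a sketch, assuming the target 2·sin(3x) from the amplitude and frequency values quoted above:

```javascript
// Numerical check of the "plateau" argument: with the frequency held at a
// wrong value, zero amplitude gives a *lower* SSR than the true amplitude,
// so the fit is pulled towards the sub-optimal a = 0 solution.
// The target 2 * sin(3x) is an assumption based on the values quoted above.
function ssrAt(a, b) {
  const xs = Array.from({ length: 200 }, (_, i) => (i / 200) * 2 * Math.PI);
  return xs.reduce(
    (sum, x) => sum + (2 * Math.sin(3 * x) - a * Math.sin(b * x)) ** 2,
    0,
  );
}
// e.g. ssrAt(0, 5) < ssrAt(2, 5), while ssrAt(2, 3) is essentially zero
```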

jacobq avatar Oct 22 '18 20:10 jacobq

Since there is a lot of work in this PR I'd like to break it down into smaller bites (each as their own commit), and possibly even merge them separately. Thoughts? (I don't mind doing the rebasing myself, if it's helpful.)

  • [x] Refactor / update test suite (there is a lot of duplication, and it's often hard to tell what is desired behavior vs. existing behavior) https://github.com/mljs/levenberg-marquardt/pull/26
  • [ ] Replace parameterError with sumOfSquaredResiduals (breaking change)
  • [ ] Support dynamic damping (breaking change)
  • [ ] Support weighting/error (breaking change)

jacobq avatar Oct 22 '18 20:10 jacobq

I'd like to break it down into smaller bites

I don't have an opinion on this, we'd better ask @targos 🙂


Should I push all the minor fixes we talked about before you start rebasing? Or will you take care of them yourself? Specifically, I mean:

  • correct preparea to prepare in package.json
  • roll package version back
  • rename params2 and residuals2 to nextParams and nextResiduals
  • refactor the code with residualDifferences = Array(10) to something reasonable
  • change sq(x) to x ** 2 or Math.pow

cshaa avatar Oct 23 '18 23:10 cshaa

Should I push all the minor fixes we talked about before you start rebasing? Or will you take care of it yourself? I'm talking specifically: ...

Yes, please. I plan to work on other branches and cherry-pick your commits, so feel free to keep working here. (I'm also not sure how much time I'll have available to work on this over the next 2 weeks.)

jacobq avatar Oct 24 '18 00:10 jacobq

@m93a just checking to see if you have done any more work related to this. I am planning to get back to this in the next week or two.

jacobq avatar Dec 17 '18 17:12 jacobq

Sorry, I didn't have much time because of school and the other project that I'm working on (which will need this repo as a dependency but we're not quite there yet). I'm not sure if I'll have spare time sooner than a month from now, so feel free to work on this until then 🙂

cshaa avatar Dec 19 '18 13:12 cshaa

I'm very curious about any progress on this proposed upgrade of the algorithm. I've used this one in C/C++, which has all the features one needs: https://pages.physics.wisc.edu/~craigm/idl/cmpfit.html

Maybe you can get some help figuring out your problem by looking into this implementation.

trulstrengerikkealias avatar Jan 16 '20 21:01 trulstrengerikkealias

I'm very curious about any progress on this proposed upgrade of the algorithm.

You know, I meant to get around to this but never did. Thanks for the reminder and the link. @m93a did you / do you plan to work more on this any time soon?

jacobq avatar Jan 16 '20 22:01 jacobq

Hey, I kinda forgot about this PR too. I might get to it after the end of our exam period, sometime in February. If I forget again, please do bump the issue :)

cshaa avatar Jan 17 '20 21:01 cshaa