pints icon indicating copy to clipboard operation
pints copied to clipboard

I1083 l bfgs optimiser

Open alisterde opened this issue 4 years ago • 9 comments

Hi @ben18785 and @MichaelClerx ,

I’ve spent the last 2 and a bit weeks looking at the BFGS and L-BFGS/ LM-BFGS algorithms for issue #1083 . The L-BFGS part is implement - if you you use a very large value of 'm' (the number of correction vectors for the inverse hessian stored) it's the same as the BFGS. I would appreciate it if you could check if I've got the Hessian update correct. There are two implementations one using scipy line search algorithms (BFGS_scipy) and another using the Hager-Zhang algorithm.

The optimiser isn't working yet, and I'm fairly sure the problem is with the line search, I’m having a lot of trouble with the line search component, as the scipy line search fails to find an acceptable step size form the very first step for the toy logistic model and the Hager-Zhang doesn't find an acceptable size either. Though as both of them have this issue it maybe the initialisation of the first inverse hessian and newton direction.

I’ve tried writing up the Hanger -Zang algorithm slightly adjusted for ask tell, I can’t quite see where I’m going wrong. I’ve checked it against an implementation in tensor flow (https://github.com/tensorflow/probability/blob/v0.10.0/tensorflow_probability/python/optimizer/linesearch/hager_zhang.py) and they seem to be preforming the same function, at the same time, in the same order as me. Stan and scipy don’t use this algorithm, they use an older method called the More-Thuente line search. The Hanger-Zhang algorithm has been shown to preform better.

I’ve also tried using the line search algorithms in scipy, but I'm having issues. Scipy doesn’t use them for their lm-bfgs - they wrap FORTRAN code. Also very few people actually seem to have implemented the BFGS alogrithms most wrap FORTRAN libraries.

Anyway, I’d appreciate it if you could check the BFGS part of what I’ve written (it’s in the tell() function of either implementation, though BFGS_scipy is clearer, and updates the hessian). Also please let me know if you have any ideas about line search.

When this is done I was thinking the majority of code would go in the abstract class LineSearchBasedOptimiser, as the only difference between BFGS, DFP, and conjugate gradient optimisation is the approximation of the inverse hessian.

I’m not going to be working on this for the next few weeks as I need to write up my project report (I have been working too much on pints have neglected it).

Best, Alister

alisterde avatar May 24 '20 17:05 alisterde

Codecov Report

Merging #1149 (4369b39) into master (a06ab88) will decrease coverage by 3.13%. The diff coverage is 11.28%.

Impacted file tree graph

@@             Coverage Diff             @@
##            master    #1149      +/-   ##
===========================================
- Coverage   100.00%   96.86%   -3.14%     
===========================================
  Files           71       84      +13     
  Lines         7543     9050    +1507     
===========================================
+ Hits          7543     8766    +1223     
- Misses           0      284     +284     
Impacted Files Coverage Δ
pints/_optimisers/__init__.py 60.12% <8.79%> (-39.88%) :arrow_down:
pints/_optimisers/_lbfgs.py 24.44% <24.44%> (ø)
pints/__init__.py 100.00% <100.00%> (ø)
pints/_util.py 100.00% <0.00%> (ø)
pints/noise.py 100.00% <0.00%> (ø)
pints/_logger.py 100.00% <0.00%> (ø)
pints/_log_pdfs.py 100.00% <0.00%> (ø)
pints/_log_priors.py 100.00% <0.00%> (ø)
pints/_diagnostics.py 100.00% <0.00%> (ø)
... and 31 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update a06ab88...4369b39. Read the comment docs.

codecov[bot] avatar May 24 '20 17:05 codecov[bot]

I've made a rookie mistake and used pythons 3 feature of specifying what type a variable should be in a function, Python 2 doesn't support it I'll fix before I next commit. And Docs are failing as I haven't included the added files and functions in the rst files yet (not sure which implementation we will eventually keep).

alisterde avatar May 24 '20 18:05 alisterde

Thanks @alisterde ! Looks like a huge amount of work has already gone into this. I'll try to carve out some time to look through this in detail, within the next few weeks. So it's quite ideal that you'll be busy with something else, from my point of view :sweat_smile:

One thing that struck me when you described the hard to find bug: Could this be an issue of scale or tuning? Many methods don't seem to work at all unless you tune them right, e.g. by picking the correct step sizes. If you were testing on the logistic problem you may also have noticed that it's got massive differences in the parameter magnitudes, which can throw some methods; so best to start on e.g. an almost symmetrical parabola in 2d param space.

MichaelClerx avatar May 26 '20 15:05 MichaelClerx

Wondering if we should split the line search parts off into a separate PR?

MichaelClerx avatar Jun 02 '20 09:06 MichaelClerx

This looks like it could be useful http://www.caam.rice.edu/~yzhang/caam554/pdf/cgsurvey.pdf

MichaelClerx avatar Jun 02 '20 09:06 MichaelClerx

Thanks @MichaelClerx and @ben18785, It could definitely be an issue of scale or tuning I hadn't thought of that I will start by trying this!

alisterde avatar Jun 02 '20 11:06 alisterde

Hi @alisterde ! I'd like to pick this up again at some point, as I think it's a really good basis for further optimisation work :-) Is this PR up to date with the latest stuff you did, or are there still some uncommitted things you've worked on?

MichaelClerx avatar Nov 03 '20 09:11 MichaelClerx

Hi Michael,

I think there might be some uncommitted changes, I'll check and get back to you. I am/was intending to pick this up again as well!

Best, Alister


From: Michael Clerx [email protected] Sent: 03 November 2020 09:35 To: pints-team/pints [email protected] Cc: Alister Dale-Evans [email protected]; Mention [email protected] Subject: Re: [pints-team/pints] I1083 l bfgs optimiser (#1149)

Hi @alisterdehttps://github.com/alisterde ! I'd like to pick this up again at some point, as I think it's a really good basis for further optimisation work :-) Is this PR up to date with the latest stuff you did, or are there still some uncommitted things you've worked on?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/pints-team/pints/pull/1149#issuecomment-721005163, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ANR4JGTWBXHC5RJQZYBSFGTSN7FEPANCNFSM4NI7HSOA.

alisterde avatar Nov 03 '20 11:11 alisterde

Hi @MichaelClerx,

I've taken another look at this and done the following:

  • Split the line search and LBFGS optimiser so that the line search can be used with other gradient based optimisers.
  • Addressed the testing areas cased by using some Python 3 only features
  • Added a basic example notebook
  • Added a stopping criteria so that any line search based algorithm will terminate when the gradients at the best fit point are equal to zero (<1e-6).

The line search algorithm needs improvement to better fit into the ask/tell framework. I've left a print statement in that tells the user when the Hessian has been updated otherwise it looks very like the algorithm is stuck when it isn't, I would like to know you're thoughts on this. There's a large commented out function at the bottom of the line search class called __secant2 this function was moved to be within ask and pass to tell at key points however I've left it for you to see as I think it's easier to understand written in this function, but I don't mind getting rid of it.

I have not added tests yet. The implementation seems quite slow. It will probable benefit from the transformation features in the most resent pints.

Tests are all passing in my per-commit checks but appear to be failing on committing due to an import problem for the 'curve_fit' functions in Python 2.7, I haven't been able to work put how to fix this.

Let me know how you want to proceed, Alister

alisterde avatar Nov 08 '20 17:11 alisterde