Optimization, Root finding, and Equation Solvers
I'm willing to spearhead a module or set of modules containing:
- general constrained and unconstrained nonlinear optimization (sqp, global optimization, etc.)
- nonlinear equation solving (newton, least squares, etc.)
- scalar minimizers and root solvers (fmin, zeroin, etc.)
I have already modernized some classic codes, and there are also others with permissive licenses floating around. I am willing to merge mine into stdlib. I need to gather them together and provide a nice interface so that a user can call any of the solvers in a similar way. We can discuss what that interface should look like. Here are some example codes:
- slsqp (SQP): https://github.com/jacobwilliams/slsqp
- pikaia (genetic algorithm): https://github.com/jacobwilliams/pikaia
- Brent codes zeroin and fmin: https://github.com/jacobwilliams/Fortran-Astrodynamics-Toolkit/blob/653236e66950782262eb1c2538f56e120d67f83e/src/brent_module.f90
- Minpack
- Other derivative-free bracketed root solvers such as bisection, Anderson-Bjorck, Muller, Pegasus, ...
- Basic Newton and/or Broyden solvers
- FilterSD: https://projects.coin-or.org/filterSD
Eventually we could even provide (optional) integrations with larger third-party and/or commercial libraries such as IPOPT and/or SNOPT.
See also:
- Scipy.optimize: https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html
- PyOpt: http://www.pyopt.org/reference/optimizers.html
- Netlib opt: https://www.netlib.org/opt/
Note that some of these in various Python packages are just wrappers to Fortran 77 codes (such as slsqp).
Thank you! This would be super helpful.
Let's discuss the user-facing public API.
I have refactored (i.e. spaghetti-free) versions of Nocedal and Morales' L-BFGS-B code, Turlach's quadratic programming code, and a couple of nature-inspired optimizers (cuckoo search and particle swarm) that I can contribute for optimization, provided the original licenses for those codes are compatible. I know one is GPL2. I also have my versions of Brent's fmin and zeroin that I got from Ralph Carmichael's PDAS site. He refactored the original FMM code toward F90.
Also, I forgot that I have a version of the Nelder-Mead simplex algorithm I can contribute.
@jacobwilliams how about MATH77? Those are amazing pieces of code...
Concerning (optional) larger third-party libraries, I think FGSL deserves consideration, even if there might be licence issues. Hopefully they can be circumvented by making its installation optional.
A set of functions in Fortran one would find on a handheld scientific calculator: https://github.com/scivision/rpn-calc-fortran#other-functions
@rweed Absolutely, we can use them! Although I think any GPL-licensed code would be off limits, since we don't want to make the whole library GPL, right?
@epagone Yep, we can check Math77 for anything we can use. That has a permissive license.
@scivision Is that for a different topic? I think we should have a function parser in stdlib that could be used there. I can make a new ticket for that.
The hard part, and the main contribution of stdlib, is to agree on a well-designed API that is approved, accepted, and adopted by the wider Fortran community. The stdlib library then provides both this specification of the API and a reference implementation.
I agree the actual implementation can and should use well-tested available code whenever possible (e.g. by porting the SciPy Fortran code into stdlib), instead of reinventing our own. We should only write our own numerical code if there is no permissively licensed code available that would work, or if we discover bugs or numerical issues (which can happen).
Yes, we can't use GPLed code, only permissive licenses like MIT or BSD.
My thinking is that it would be a high-level object-oriented interface, something like what I did for slsqp. I think this is the best way for modern codes of this type. The alternatives (forcing you to stuff your user data into giant work arrays, common blocks, or reverse communication) are unsatisfactory as far as I'm concerned.
There would be procedures for:
- initializing the class, sending in all the input parameters (tolerances, etc.)
- solving the problem
- destroying the class
There would be methods a user would have to provide:
- the problem function (compute the objective function if present, and the constraints)
- the gradient function (compute the gradient of the objective function and the constraints) -- for those methods that require gradients.
- a function to report an iteration to the user
There would also be some mechanism for stopping the solve process (either within the problem or report function).
Other notes:
- some methods will have simple bounds (on the optimization variables), some may not.
- some methods may require a dense gradient matrix, and others may use a sparse one.
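To make this concrete, here is a minimal sketch of what calling such an interface could look like (all names, signatures, and arguments here are hypothetical illustrations, not a proposed final API; wp is a working-precision real kind):

type(solver_type) :: solver
real(wp) :: x(n)
integer :: istat

! initialize the class: send in all the input parameters
call solver%initialize(n, m, my_func, tol=1.0e-8_wp, max_iter=100, &
                       grad=my_grad, report=my_report)
! solve the problem (istat could also signal a user-requested stop)
call solver%solve(x, istat)
! destroy the class
call solver%destroy()

Here my_func would evaluate the objective and constraints, my_grad their gradients (for gradient-based methods), and my_report would be called once per iteration.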
@jacobwilliams, in my code I usually (but not always; for old code it's sometimes not practical) use an empty abstract class (optimizers_t) and derive all of the base classes from that. I augment that with an abstract objective-function class. I think at a minimum, if you go OO, you will need something like these two classes.
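In outline, those two abstract classes could look something like the following hedged sketch (type and procedure names are illustrative only; wp is a working-precision real kind):

type, abstract :: optimizer_t
   ! empty parent class from which all optimizers are derived
end type optimizer_t

type, abstract :: objective_t
   ! the user extends this type with any problem data
contains
   procedure(objective_interface), deferred :: eval
end type objective_t

abstract interface
   function objective_interface(this, x) result(f)
      import :: objective_t, wp
      class(objective_t), intent(inout) :: this
      real(wp), intent(in) :: x(:)
      real(wp) :: f
   end function objective_interface
end interface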
Also, I could use a little guidance from those of you who are wise in the ways of software licenses. If I take a code written in another language (MATLAB and Python in this case) and completely rewrite it in Fortran (using none of the original source, just using the implementation as a guide), does the Fortran code inherit the license restrictions of the original code, albeit in another language?
Actually, we probably need some sort of licensing guide that covers these kinds of issues and goes over what's usable and what's not (including copyrighted material from books, etc.).
@jacobwilliams the functions are a bunch of polymorphic math functions that could readily be adapted for stdlib.
The simple REPL it has is a separate idea, if a pure Fortran REPL would be of interest.
If I take a code written in another language (MATLAB and Python in this case) and completely rewrite it in Fortran (using none of the original source, just using the implementation as a guide), does the Fortran code inherit the license restrictions of the original code, albeit in another language?
If you take a program in one language and rewrite it in another language, that is considered derivative work. So if the original is, say, GPL licensed, then your derivative work must also be GPL licensed. If you want to avoid that, you can do a so-called clean-room implementation: one person writes a specification based on the original code, and another person takes that specification and creates a new implementation from it without ever looking at the original code.
My thinking is that it would be a high-level object oriented interface.
@jacobwilliams besides a high-level interface, would you be against also having a low-level non-OO API in stdlib, if others volunteer it?
A library for geodesy and geographic coordinate transforms for sensors, satellites, etc.: https://github.com/scivision/maptran3d
@rweed That sounds similar to what I was thinking. Do you have any of your examples in a repo somewhere I could examine? I hope to find some time soon to start working up an example.
@certik I have no objection to a low-level interface. I think in some cases it would be fairly easy to make. FYI: if you look at the SLSQP code, you will notice that the original reverse-communication interface is still there, and that's what the OO interface calls. However, I don't want to impose this pattern on all the solvers, since it makes for somewhat weird and not straightforward code, in my opinion. I think for a "simple" interface to an OO solver, all we'd need to do is make a subroutine where everything is passed in as arguments (including non-type-bound functions), and then the class is just created and run inside this subroutine. It would be easy to do, I think... but the devil is in the details...
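For example, the "simple" wrapper could be as small as this hypothetical sketch (solver and interface names are assumptions, not an agreed API):

subroutine solve_slsqp(func, grad, x, tol, istat)
   ! non-OO convenience wrapper: the class is created, run, and
   ! destroyed entirely inside this subroutine
   procedure(func_interface) :: func
   procedure(grad_interface) :: grad
   real(wp), intent(inout) :: x(:)   ! initial guess in, solution out
   real(wp), intent(in) :: tol
   integer, intent(out) :: istat
   type(slsqp_solver) :: solver
   call solver%initialize(size(x), func, grad, tol=tol)
   call solver%solve(x, istat)
   call solver%destroy()
end subroutine solve_slsqp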
@jacobwilliams perfect, I agree. Where there's a will, there's a way. I will support you (and anybody) on a nice OO interface, and if you support me (and others) on a nice low level non OO interface, then I think we all, as a community, will get what we want.
Another issue is choosing which algorithms to make available for solving a given problem.
E.g., for popular problems like one-dimensional root-finding there is plenty of choice (I'm afraid too much): there are some algorithms available in MATH77, @jacobwilliams has refactored and modernised zeroin, and I am pretty sure there could be other good implementations already available (see John Burkardt's large collection). For the numerical integration of one-dimensional functions there are many strategies and implementations available, etc.
Should we expose through the API all the routines available, or should we choose the "best" ones?
@epagone excellent question. I would allow the user to select the method, with some unified API. Let's look at how other projects do this, and then we can discuss what the best API would be.
SciPy
https://docs.scipy.org/doc/scipy/reference/optimize.html#root-finding
It looks like they have a root_scalar function that accepts a method argument:
root_scalar(method='brentq')
root_scalar(method='brenth')
root_scalar(method='bisect')
root_scalar(method='ridder')
root_scalar(method='newton')
root_scalar(method='toms748')
root_scalar(method='secant')
root_scalar(method='halley')
and in addition, they provide functions for each algorithm, such as brentq, ridder, etc.
Matlab
Matlab has fzero, which seems to be just one particular method. Then they have a general method solve where one can set a solver argument. See also the section "Algorithms"; for root finding, it looks like they only have one method.
Julia
Julia has a separate package, Roots.jl, which provides both the Matlab-like fzero and their own find_zero, which accepts a method as an argument (such as Bisection() or Newton()). The Roots.jl package was started as a result of this issue: https://github.com/JuliaLang/julia/issues/2447, where you can find the initial discussion about the API.
FWIW I vote for the SciPy interface, i.e. a string argument that selects the method to call. Clean and modular. Maybe we should still select the "best" algorithm (which will be the default choice) and make the string argument optional.
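A hedged sketch of what that could look like in Fortran (routine and interface names here are hypothetical):

subroutine root_scalar(f, x1, x2, xzero, method)
   procedure(func_interface) :: f            ! user function f(x)
   real(wp), intent(in) :: x1, x2            ! bracketing interval
   real(wp), intent(out) :: xzero
   character(*), intent(in), optional :: method
   character(:), allocatable :: m
   m = 'brent'                               ! the "best" default
   if (present(method)) m = method
   select case (m)
   case ('brent')
      call brent(f, x1, x2, xzero)           ! per-method routines would
   case ('bisect')                           ! also be public, as in SciPy
      call bisect(f, x1, x2, xzero)
   case default
      error stop 'root_scalar: unknown method'
   end select
end subroutine root_scalar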
@jacobwilliams I don't have the code in a repository, but the following will give you an example of how I implement the abstract classes. I'll be the first to admit there is a lot of room for improvement, but this works for my current applications.
- An empty abstract class that is the parent class for all other optimizers. I probably could add some abstract methods, but that forces you to implement all the abstract methods even if you don't need them all: optimizerClass.F90.txt
- A module with abstract classes for objective functions and constraints.
- The basic class and subroutine interface for my cuckoo search code, based on Walton's modified cuckoo search MATLAB code (see comments in the module), which is unfortunately GPL2. I pass the objective function class as an argument instead of including it as class data via a procedure pointer. I have generic methods (optimizer, init, and dealloc) to do the basic things you need for initialization etc.
- A set of standard test cases for unconstrained optimization illustrating use of the objective function class.
- Finally, a test program for the cuckoo search routines. Note that the actual code uses a lot of utilities for random numbers and my implementations of various MATLAB functions.
The Boost C++ library simply provides the routines under different names:
- Root Finding Without Derivatives
// Bisection
template <class F, class T, class Tol>
std::pair<T, T>
bisect(
F f,
T min,
T max,
Tol tol,
boost::uintmax_t& max_iter);
// Bracket and Solve Root
template <class F, class T, class Tol>
std::pair<T, T>
bracket_and_solve_root(
F f,
const T& guess,
const T& factor,
bool rising,
Tol tol,
boost::uintmax_t& max_iter);
// TOMS 748 algorithm
template <class F, class T, class Tol>
std::pair<T, T>
toms748_solve(
F f,
const T& a,
const T& b,
Tol tol,
boost::uintmax_t& max_iter);
- Root Finding With Derivatives
// Newton-Raphson
template <class F, class T>
T newton_raphson_iterate(F f, T guess, T min, T max, int digits, boost::uintmax_t& max_iter);
// Halley
template <class F, class T>
T halley_iterate(F f, T guess, T min, T max, int digits, boost::uintmax_t& max_iter);
// Schröder
template <class F, class T>
T schroder_iterate(F f, T guess, T min, T max, int digits, boost::uintmax_t& max_iter);
Personally, I think the SciPy way fits Fortran best: we provide both a root_scalar / find_zero function where you specify the method using a string, and also a set of functions for each algorithm. The find_zero can then pick a default method based on the arguments. In SciPy the default is Brent's algorithm for bracketed roots, and Newton or secant if not; see the excerpt from SciPy's root_scalar below:
# Pick a method if not specified.
# Use the "best" method available for the situation.
if not method:
    if bracket:
        method = 'brentq'
    elif x0 is not None:
        if fprime:
            if fprime2:
                method = 'halley'
            else:
                method = 'newton'
        else:
            method = 'secant'
if not method:
    raise ValueError('Unable to select a solver as neither bracket '
                     'nor starting point provided.')
The root solvers are a good project to test some type-templating mechanism like Jin2For. Having to modify both the single and double precision versions of each solver would be annoying.
I agree with almost everything written by @ivan-pi. The only variation I would suggest is to make Brent's method the default for scalar root-finding without derivatives, and the Newton-Raphson algorithm when a derivative is provided (as per the theory). Furthermore, practical implementations rarely code the textbook methods exactly: e.g. some solvers (I don't recall exactly which at the moment, and I cannot check) use Newton's method with bracketing to detect divergence. I would focus more on the practical implementations to decide on the use-cases, although I understand that we need to give each a clear name to make them selectable.
The NLopt library contains several constrained and unconstrained nonlinear optimization routines and comes with a Fortran interface (geared toward the old F77 style of Fortran). Ironically, many of the routines are translations of old Fortran ones to C using f2c. I find the NLopt API very intuitive and simple to use. I made a pull request with a modern Fortran interface for NLopt a while ago. Maybe it could serve as a template for the stdlib one.
Would root-bracketing routines be in scope for this issue? I believe so, but I am struggling to find any prior art in this area. I remember some in Numerical Recipes (which has a well-known troublesome licence) and szero/dzero of MATH77 (which have some partial bracketing embedded to expand the initial search range).
Are you aware of any high-quality bracketing routines available in the wild?
Usually such algorithms and high-quality implementations can be found on netlib - https://www.netlib.org/.
I am well aware of netlib, but maybe I do not know the right keywords, as I couldn't find much. I will check what other languages offer.
After searching a bit for bracketing routines in other languages, I have found only one full solution and another partial one.
- IMO Boost has the best "packaged" solution (untested) with the routine Bracket and Solve Root:
bracket_and_solve_root is a convenience function that calls the TOMS 748 algorithm internally to find the root of f(x). It is generally much easier to use this function rather than the TOMS 748 algorithm, since it does the hard work of bracketing the root for you. Its bracketing routines are quite robust and will usually be more foolproof than home-grown routines, unless the function can be analysed to yield tight brackets.
TOMS 748 is an improved Brent's method with some additional constraints (see the Notes here).
- Julia's find_zeros (again, untested) claims to automatically find multiple roots in a specified interval, but I couldn't find anything regarding expanding an interval to try to bracket a root.
To my surprise, I could not easily find anything relevant in Python.
I would definitely say bracketing is part of this issue. From the looks of it, Julia's find_zeros will attempt to find multiple roots on a defined interval, by dividing that interval into sub-intervals and checking for a difference in sign. It doesn't attempt to automatically expand the interval in search of a sign difference.
I guess the reason you cannot find much prior art is that writing at least a simple bracket search is not that hard. For a fixed interval, you divide it into smaller pieces. For an open interval, you expand (e.g. by multiplying the interval by a constant factor). In applications where I required a root-finder, my problems typically had some "intrinsic" brackets (e.g. related to known solution limits), or I had a good initial guess (e.g. from a previous time step), meaning I could safely use an iterative method without bracketing.
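A minimal sketch of such a geometric expansion (hypothetical code, in the spirit of the classic zbrac idea; func_interface and wp are assumed names):

subroutine expand_bracket(f, a, b, found, factor, max_iter)
   ! grow [a,b] geometrically until f changes sign across it, or give up
   procedure(func_interface) :: f
   real(wp), intent(inout) :: a, b       ! initial interval in, bracket out
   logical, intent(out) :: found
   real(wp), intent(in) :: factor        ! e.g. 1.6
   integer, intent(in) :: max_iter
   real(wp) :: fa, fb
   integer :: i
   fa = f(a); fb = f(b)
   found = .false.
   do i = 1, max_iter
      if (fa*fb < 0.0_wp) then
         found = .true.
         return
      end if
      ! expand on the side where |f| is smaller (more likely near a crossing)
      if (abs(fa) < abs(fb)) then
         a = a - factor*(b - a)
         fa = f(a)
      else
         b = b + factor*(b - a)
         fb = f(b)
      end if
   end do
end subroutine expand_bracket

As noted below, this kind of search still fails for double roots such as x^2, where f never changes sign.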
I think the way Boost does it is the most flexible: either you call the root-finding routine directly (on your interval of choice), or you allow the root-finder to expand/shrink the bracket as necessary. I would be in favor of having this. I would point out, however, that the Boost function is limited to monotonic functions. Such bracketing routines will also fail at double roots (e.g. for the function x^2). Unfortunately, no design will be completely foolproof.
I also found that the NAG library has a solution: NAG FL Interface - C05 (Roots): https://www.nag.com/numeric/nl/nagdoc_latest/flhtml/c05/c05conts.html
The IMSL Fortran library on the other hand only provides one bracketed root-finding method (Brent's) and one iterative one (Müller's method): IMSL - Chapter 7: Nonlinear equations: https://help.imsl.com/fortran/6.0/math/default.htm?turl=zbren.htm
Wrt Netlib: I searched for "Newton" and came across quite a few references. I also have a number of the minimisation routines developed by Mike Powell. Probably of less relevance for the standard library is a collection of articles on probabilistic optimisation algorithms (I have implemented them in Tcl, my other favourite language, based on these articles). I can also dig up some other stuff concerning root finding and the like. Mind you: it is one thing to have an algorithm; it is quite another to ensure that it always provides an answer (or detects that things are going wrong).
It might also be a good idea to consider "automatic differentiation" as the basis for algorithms that require (first-order) derivatives, like Newton-Raphson.
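For instance, forward-mode AD with a dual-number type gives exact first derivatives; a minimal sketch (hypothetical, not an existing stdlib type; wp is a working-precision real kind):

type :: dual
   real(wp) :: val   ! function value
   real(wp) :: der   ! derivative value
end type dual

! arithmetic is overloaded so the derivative propagates automatically,
! e.g. the product rule:
elemental function dual_mul(a, b) result(c)
   type(dual), intent(in) :: a, b
   type(dual) :: c
   c%val = a%val*b%val
   c%der = a%der*b%val + a%val*b%der
end function dual_mul

Evaluating f(dual(x, 1.0_wp)) then yields both f(x) and f'(x), which is exactly what a Newton-Raphson step x - f(x)/f'(x) needs, with no finite-difference error.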
I guess the reason you cannot find much prior art is that writing at least a simple bracket search is not that hard. For a fixed interval you divide it into smaller pieces. For an open interval you expand (e.g. by multiplying the interval by a constant factor).
True, but if you want to do a proper job (that is efficient and takes into account subtle numerical problems), it's not that trivial. Have a look at the algorithm of MATH77 (that only looks outside of an interval to bracket a root):
When $F1 \times F2 > 0$ at the initial point, iterates are generated according to the formula $x = x_{\min} + (x_{\min} - x_{\max}) \times \rho$, where the subscript "min" is associated with the $(x, f)$ pair that has the smallest value of $|f|$, and $\rho$ is 8 if $r = f_{\min}/(f_{\max} - f_{\min}) \geq 8$, else $\rho = \max(\kappa/4, r)$, where $\kappa$ is a count of the number of iterations that have been taken without finding values of $f$ with opposite signs. If X1 and X2 have the same value initially (and F1 and F2 equal F(X1)), then the next $x$ is a distance $0.008 + |x_{\min}|/4$ from $x_{\min}$, taken toward 0. (If X1 = X2 = 0, the next $x$ is $-0.008$.)
It is a bit like coding your own scalar root-finding method: getting something working is not that hard, but doing it properly is much harder.
In applications where I required a root-finder my problems typically had some "intrinsic" brackets (e.g. related to known solution limits), or I had a good initial guess (e.g. from a previous time step), meaning I could safely use an iterative method without bracketing.
Me too...in some cases, but not all.
I think the way Boost does it is the most flexible: either you call the root-finding routine directly (on your interval of choice), or you allow the root-finder to expand/shrink the bracket as necessary. I would be in favor of having this. I would point out, however, that the Boost function is limited to monotonic functions. Such bracketing routines will also fail at double roots (e.g. for the function x^2). Unfortunately, no design will be completely foolproof.
Agreed
I also found the NAG library has a solution: NAG FL Interface - C05 (Roots)
The IMSL Fortran library on the other hand only provides one bracketed root-finding method (Brent's) and one iterative one (Müller's method): IMSL - Chapter 7: Nonlinear equations
Nice, thanks.
OK, just to get something started, I have: https://github.com/jacobwilliams/opt-lib
Currently, this has methods for non-derivative scalar root finding on a bracketed interval. The methods are in classes, but I also have a single function interface (root_scalar). Example:
call root_scalar('pegasus', func, x1, x2, xzero, fzero, iflag)
Methods are:
- brent [basically zeroin]
- brentq [alternate brent from SciPy]
- brenth [alternate brent from SciPy]
- bisection
- anderson_bjorck
- ridders
- pegasus
- bdqrf
- muller
This is not final by any means, and comments are welcome! I also have to figure out the preprocessor thing to get it to work with different real kinds.
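One option (a hypothetical sketch assuming the fypp preprocessor, which stdlib already uses elsewhere; func_${k}$ interfaces are assumed) is to generate each solver once per real kind:

#:set KINDS = ['sp', 'dp', 'qp']
#:for k in KINDS
! generated once per real kind by fypp
subroutine bisection_${k}$(f, x1, x2, xzero)
   procedure(func_${k}$) :: f
   real(${k}$), intent(in) :: x1, x2
   real(${k}$), intent(out) :: xzero
   real(${k}$) :: a, b, c
   integer :: i
   a = x1; b = x2
   do i = 1, 100                      ! fixed iteration count for the sketch
      c = 0.5_${k}$*(a + b)
      if (f(a)*f(c) <= 0.0_${k}$) then
         b = c
      else
         a = c
      end if
   end do
   xzero = 0.5_${k}$*(a + b)
end subroutine bisection_${k}$
#:endfor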