OptimPack icon indicating copy to clipboard operation
OptimPack copied to clipboard

Switching COBYLA, NEWUOA, and BOBYQA to the PRIMA implementation?

Open zaikunzhang opened this issue 9 months ago • 13 comments

Dear OptimPack maintainers,

This is Dr. Zaikun Zhang from The Hong Kong Polytechnic University. Together with Professor N.I.M. Gould, I am responsible for maintaining the renowned derivative-free optimization solvers of the late Professor M.J.D. Powell, namely COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA. I am the author of PRIMA, which provides the modernized reference implementation for these solvers.

Thank you for making COBYLA, NEWUOA, and BOBYQA available in OptimPack. However, I note that it is currently done based on the unmaintained Fortran 77 implementation, using a C translation by f2c. Although the Fortran 77 implementation is truly a masterpiece, it contains many bugs, which can lead to segmentation faults or infinite loops. For example, see Section 4.4 of my recent paper on Powell's methods and many GitHub issues. It is strongly discouraged to use the Fortran 77 version of these solvers anymore.

That said, I suggest you consider switching to the PRIMA implementation of these solvers, and possibly make UOBYQA and LINCOA available in OptimPack as well. I will be very happy to provide assistance on the Fortran side if you would like to do so.

Besides, the following by @amontoison may be interesting to you:

https://github.com/JuliaPackaging/Yggdrasil/blob/master/P/PRIMA/build_tarballs.jl

FYI, the inclusion of PRIMA solvers in SciPy is under discussion, and the major SciPy maintainers are quite positive.

Thanks and regards, Zaikun ZHANG Ph.D. and Assistant Professor Dept. App. Math., Hong Kong Polytechnic University

zaikunzhang avatar Sep 14 '23 18:09 zaikunzhang

Dear Zaikun Zhang,

Thank you for your interest and comments. I can only agree: it's absolutely essential that Powell's algorithms are available in modern programming languages, with bug fixes and improvements.

Like you, I struggled with original FORTRAN 77 codes to make them portable and relax some constraints so that the methods could be easily called from foreign languages. I was more efficient in C and I think that it is easier to write portable C. So I used f2c to intialize the process and then heavily modify/simplify the code to make it readable (I could remove all goto's in NEWUOA but give up for the other methods, I've seen that you have been able to cleanup the code from these "beasts") and maintainable and to attempt to make the optimzers agnostic to the language into which the objective function is coded. To do that, I used "reverse communication" technique. Among the other improvements are the ability to scale the variables of the problem (using a simple Euclidean norm for the trust region and for the convergence is not suitable for many practical problems) and to hide the "workspaces" to the end-user.

I would be happy to use PRIMA, as you and Tom Ragonneau seem to have done a much better job than I've done. You mention the port in Yggdrasil so I guess that Julia's artifact for PRIMA (I found PRIMA_jll) is available but is there any Julia interface built around it?

Éric Thiébaut

emmt avatar Sep 15 '23 11:09 emmt

Dear Éric,

Thank you for your encouraging words.

I fully agree that the Fortran 77 code is quite challenging to deal with. It took me three years, day to day, full time (3 x 360 x 10 plus hours) to decode the GOTO maze and sort things out. The difficulty is exactly the reason I worked on it: I hope I am the last one in the world to decode a maze of 244 GOTOs in 7939 lines of Fortran 77 code — I did this for three years and I do not want anyone else to do it again. I did not ask Tom to do this, because, as his PhD supervisor, I hope he could have an easier life.

I am very impressed by the improvements you have made to Powell's solvers, despite the above-mentioned difficulties with the original F77 code.

you mention the port in Yggdrasil so I guess that Julia's artifact for PRIMA (I found PRIMA_jll) is available but is there any Julia interface built around it?

The artifacts are made by @amontoison, and I suppose he intends to provide a Julia wrapper for PRIMA as well, but it is not finished yet. Meanwhile, I think it is still a good idea to switch the old F77-based solvers in OptimPack to the PRIMA version, or even include the other two solvers that are missing (UOBYQA and LINCOA) from OptimPack for now.

Thanks and regards, Zaikun

zaikunzhang avatar Sep 15 '23 12:09 zaikunzhang

Thanks for your answer. I hope you don't mind if we switch to a more informal conversation (as is usually done on Github issues).

I can imagine how painful it can be 3 years of re-coding. But the result seems very useful for others. I had a look at the C interface of PRIMA and I can already see the benefits of a similar interface for these methods (without the needs to allocate workspaces, etc.) not to speak about other improvements (bugs fixes, etc.). I do not see the np parameter, am I missing something or is this deliberate? I guess np = 2n + 1 as recommended by Mike Powell? Anyway, interfacing your library seems pretty easy so I will give it a try and let you know...

emmt avatar Sep 15 '23 13:09 emmt

I guess you meant npt, whch is the number of points in the interpolation set. It is still an input of the Fortran implementation (see, e.g., NEWUOA). Of course, as you probably know, it exists only in NEWUOA, BOBYQA, and LINCOA.

However, you are right that it is missing from the C interface --- I did not notice this before. The C interface is kindly provided by @jschueller. I will suggest he include npt as an input, because this is an important input. npt = 2n+1 works well in general, but Professor Powell did mention that a smaller npt might be beneficial sometimes. It should be defaulted to 2n+1, but should also be exposed to the user.

Anyway, interfacing your library seems pretty easy so I will give it a try and let you know...

I look forward to hearing about your success! I am at your disposal to provide assistance on the Fortran side.

zaikunzhang avatar Sep 15 '23 13:09 zaikunzhang

You are right npt indeed!

emmt avatar Sep 15 '23 13:09 emmt

I can imagine how painful it can be 3 years of re-coding. But the result seems very useful for others.

It was indeed painful --- if not depressive. It was like going through a tunnel that never ends. Finally, it fortunately ended.

As long as the result is useful to the researchers and practitioners, my pain will be compensated.

It was a promise I made to Professor Powell.

zaikunzhang avatar Sep 15 '23 13:09 zaikunzhang

I now have a premiminary version of a Julia interface that suits my needs and, I hope, others needs as well. The repository is PRIMA.jl. It is not yet fully documented nor fully tested. But all optimizers work on the same examples as those of the C interface. I'd be happy to merge this with the pending work to build a full Julia interface you have mentioned earlier (I don't want to short-circuit anyone) but I could not wait for testing your version of Mike Powell's algorithms in Julia...

A few remarks:

  • I had to implement a per-thread stack of pointers to the user-defined objective functions (hence written in Julia) to allow for calling the optimizers in different threads and for having hierarchical optimization (e.g. the objective function itself needs to call one of the PRIMA optimizers, possibly the same). I have seen that, in the C interface, a data argument has been added to the objective function. But this has not yet been propagated in PRIMA_jll and I am not sure that this is the correct way to do that. Is there some guarantees that your (@zaikunzhang) FORTRAN90 algorithms can be used in such a way (that is being thread-safe and stackable). I remeber that, in Mike Powell's original FORTRAN77 version, the onjective function must have a given name (it was not part of the arguments of the optimizer).

  • For COBYLA, NEWUOA, and LINCOA, I had a warning that IPRINT is incorrect and that 0 is used instead. I do not have such a problem with UOBYQA and COBYLA. I have not invistagated whether this error is in my ocde or in the C wrapper or elsewhere.

Anyway, apart from debugging, the next step for me is to compare this new versions with my own, before switching to PRIMA for real (and forget my attempt to port these algorithms).

emmt avatar Sep 21 '23 21:09 emmt

Fantastic!

I had to implement a per-thread stack of pointers to the user-defined objective functions (hence written in Julia) to allow for calling the optimizers in different threads and for having hierarchical optimization (e.g. the objective function itself needs to call one of the PRIMA optimizers, possibly the same).

This is topical. See https://github.com/libprima/prima/issues/84

I have seen that, in the C interface, a data argument has been added to the objective function. But this has not yet been propagated in PRIMA_jll and I am not sure that this is the correct way to do that. Is there some guarantees that your (@zaikunzhang) FORTRAN90 algorithms can be used in such a way (that is being thread-safe and stackable).

@jschueller Julien, could you answer this question (thread-safety of the data pointer)?

I remeber that, in Mike Powell's original FORTRAN77 version, the onjective function must have a given name (it was not part of the arguments of the optimizer).

Yes, this is also the case for the new implementation.

For COBYLA, NEWUOA, and LINCOA, I had a warning that IPRINT is incorrect and that 0 is used instead. I do not have such a problem with UOBYQA and COBYLA. I have not invistagated whether this error is in my ocde or in the C wrapper or elsewhere.

This is interesting. @jschueller Julien, could you check whether the C inerface sets iprint correctly? It should be 0, 1, 2, 3, or -1, -2, -3.

Anyway, apart from debugging, the next step for me is to compare this new versions with my own, before switching to PRIMA for real (and forget my attempt to port these algorithms).

Great! I hope to point out the PRIMAis developed with derivative-free optimization applications in mind. In these applications, function evaluations dominate the cost. Each function evaluation may take seconds, minutes, hours, days, ... The evaluations most likely come from complicated simulations (simulation-based optimization).

Hence PRIMA is optimized to reduce the number of function evaluations, sometimes at the cost of flops within the solver. If we count the number of function evaluations, PRIMA will win according to my extensive benchmarking. However, if you test the computing time with functions that are easy to evaluate (each evaluation takes 0.01 seconds, for example), then the old Fortran 77 code will win, because the numerical linear algebra (flops) of the solver itself will become the major cost, and no one can do this part better than Professor Powell.

See also the comments here: https://github.com/libprima/prima/issues/29#issuecomment-1590839885

Thanks.

zaikunzhang avatar Sep 21 '23 23:09 zaikunzhang

  • we did test that the algorithms can be called recursively, see #85, but not in different threads
  • i print does works as expected in the C interface

jschueller avatar Sep 22 '23 05:09 jschueller

The iprint issue is now solved for Julia package PRIMA.jl with the 0.7.1 version of PRIMA_jll built by Alexis @amontoison: all the 5 algorithms of the PRIMA library work fine in that respect.

emmt avatar Sep 22 '23 10:09 emmt

@zaikunzhang, I just saw @emmt did mention scaling of the variables, I think the original f77 code did not do that, does prima do some kind of preconditionning now ?

jschueller avatar Sep 22 '23 15:09 jschueller

Scaling of variables, e.g. to make them of same magnitude, is also of importance for the trust region radius. This can be implemented in the high level interfaces via keywords or optional arguments.

emmt avatar Sep 22 '23 16:09 emmt

Not in the Fortran implementation. It is better done in the interfaces, provided the "upper-level" language is more expressive, which is the case for Python, MATLAB, Julia, R ... It is often a good idea in practice.

It is done in the MATLAB interface:

https://github.com/libprima/prima/blob/main/matlab/interfaces/private/preprima.m#L295-L308 https://github.com/libprima/prima/blob/main/matlab/interfaces/private/postprima.m#L362-L399

It will not be done in the Fortran code. The implementation would be complicated due to the lack of lambda function / anonymous function in Fortran. Why are lambda / anonymous functions needed? Because we need to redefine the objective / constraint functions according to the scaling.

Thanks.

zaikunzhang avatar Sep 22 '23 16:09 zaikunzhang