
Calculation of Gradient of operand with respect to variable

Open Jonas231 opened this issue 4 months ago • 8 comments

Fill out the following checklist before submitting your problem. This helps us respond more effectively and keep the issue tracker focused.

Checklist

  • [x] I searched the [existing issues](https://github.com/HarrisonKramer/optiland/issues) and [discussions](https://github.com/HarrisonKramer/optiland/discussions) for a similar question or feature request.
  • [x] I read the [documentation](https://optiland.readthedocs.io/en/latest/) and tried to find an answer there.
  • [x] I use the **latest version** of Optiland.
  • [x] I have included all necessary context.

Thank you for taking the time to do this - it really helps us to help you!

Feature Request

Does your feature request relate to a problem? Please describe.

It would be great to be able to calculate the change of all operand values with respect to a small increment in any variable (the gradient), for example via a finite-difference method. This would yield the Jacobian matrix. In this way, the damped least-squares Levenberg-Marquardt (LM) method of local optimization provided by SciPy (MINPACK) could be enhanced with a better damping scheme that results in faster convergence.
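As a rough illustration of the request, a forward-difference Jacobian over a generic residual function could look like the sketch below. The `residuals` callable standing in for an operand evaluation is an assumption for illustration, not Optiland's actual API:

```python
import numpy as np

def finite_difference_jacobian(residuals, x, eps=1e-7):
    """Forward-difference Jacobian J[i, j] = d(residual_i) / d(variable_j).

    `residuals` maps a variable vector of shape (n,) to an operand
    residual vector of shape (m,); it is a placeholder standing in for
    an operand evaluation, not Optiland's actual API.
    """
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(residuals(x))
    jac = np.empty((f0.size, x.size))
    for j in range(x.size):
        step = eps * max(1.0, abs(x[j]))  # scale the step to the variable
        xp = x.copy()
        xp[j] += step
        jac[:, j] = (np.asarray(residuals(xp)) - f0) / step
    return jac
```

Each column costs one extra evaluation of all operands, so for n variables this is n + 1 full traces per cycle; central differences would double that in exchange for accuracy.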

Describe the solution you'd like

Calculate the Jacobian matrix, perhaps by retrieving it via SciPy (MINPACK), if possible.

Describe alternatives you have considered

Write an interface to Modern Minpack or https://github.com/pkgw/pwkit/blob/master/pwkit/lmmin.py, then modify one of these packages to implement the PSD method.

Additional context

Final goal: implementation of Don Dilworth's Pseudo-Second-Derivative (PSD) local optimization method, which is 10-100x faster than other damping schemes (as demonstrated in SYNOPSYS by Osdoptics). See: http://dx.doi.org/10.1117/12.949224

Jonas231 avatar Aug 31 '25 16:08 Jonas231

Link to the most recent paper about PSD: https://www.researchgate.net/publication/271488365_Automatic_Lens_Optimization_Recent_Improvements

Jonas231 avatar Aug 31 '25 20:08 Jonas231

The Zemax guys also tried to implement it: https://community.zemax.com/got-a-question-7/new-optimization-methods-1098

Jonas231 avatar Aug 31 '25 20:08 Jonas231

And in short described in words here: https://www.mdpi.com/2313-433X/4/12/137

Jonas231 avatar Aug 31 '25 20:08 Jonas231

Hello @Jonas231 ,

Thank you for opening this issue and for providing the links to the related resources.

While I am not familiar with MINPACK, nor with the Modern Minpack you mentioned, I can provide some preliminary results that compare the performance of the PSD vs. DLS optimizers for a simple problem.

I have been working on a custom-made DLS algorithm that uses a modern damping-factor update rule (Nielsen's) with stagnation triggers. After reading through most of the resources you mentioned, I implemented the PSD method coupled with this modern update rule, and the results were interesting.
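For readers unfamiliar with it, Nielsen's damping-factor update (from the Madsen/Nielsen/Tingleff formulation of Levenberg-Marquardt) can be sketched as below; the function signature is illustrative, not the implementation discussed in this thread:

```python
def nielsen_update(mu, nu, gain_ratio):
    """Nielsen's damping update for LM-style optimizers (illustrative sketch).

    gain_ratio compares the actual to the predicted reduction of the
    merit function for the last step; mu is the damping factor and nu
    the geometric growth factor used on rejected steps.
    """
    if gain_ratio > 0:  # step accepted: shrink damping smoothly
        mu *= max(1.0 / 3.0, 1.0 - (2.0 * gain_ratio - 1.0) ** 3)
        nu = 2.0
    else:               # step rejected: grow damping geometrically
        mu *= nu
        nu *= 2.0
    return mu, nu
```

The smooth cubic shrink avoids the oscillation that fixed divide-by-10 rules can produce near a minimum.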

-> Optimization Problem:

  • optimization of the front surface of a singlet lens to reshape a collimated Gaussian input beam into a uniform flat-top irradiance profile. Two surface types (Even Asphere and Forbes QBFS) and two methods (DLS and PSD) are compared.

-> Optimization Targets:

  • generate the target values for the operand real_y_intercept following the ray-mapping equation, whose derivation can be found at https://optics.ansys.com/hc/en-us/articles/42661743954835-How-to-design-a-Gaussian-to-Top-Hat-beam-shaper .

-> Optimization Variables:

  • Even Asphere System: Radius + Conic + 7 coefficients
  • Forbes QBFS System: Radius + Conic + 7 coefficients (with the normalization radius fixed at a value equal to the lens diameter)

Results:

[Image: merit-function convergence, DLS vs. PSD, for the Even Asphere and Forbes QBFS systems]

Rough Explanation:

  1. First, let us compare the performance of the optimizers for the Even Asphere system. It is clear that for such a system, PSD provides faster convergence.
  2. Second, let us compare the performance of the optimizers when the surface being optimized is the Forbes QBFS. Here the PSD presents no significant advantage over the custom DLS described above. In fact, simply reformulating the problem by describing the surface as a Forbes QBFS yields immediate convergence advantages, even for the DLS optimizer. Because the Forbes polynomials form an orthogonal basis, the solution space is much better behaved, so second-order derivative information is not as valuable as it is for the Even Asphere (which creates a poorly behaved solution space, where that crucial information about the local curvature is critical).
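For readers unfamiliar with PSD, the core idea (as described in the Dilworth papers linked above) is to estimate second derivatives cheaply from the change of the first derivatives between successive optimization cycles, at no extra ray-trace cost. A heavily simplified sketch of that estimate follows; it is not Dilworth's full method and not code from this thread:

```python
import numpy as np

def pseudo_second_derivatives(jac_prev, jac_curr, dx):
    """Estimate d2(operand_i)/d(variable_j)2 from two successive Jacobians.

    Simplified sketch of the PSD idea: the change of each Jacobian
    column over the step dx (one entry per variable) approximates the
    second derivatives without additional ray traces.  Not Dilworth's
    full method, and not Optiland code.
    """
    dx = np.asarray(dx, dtype=float)
    return (np.asarray(jac_curr) - np.asarray(jac_prev)) / dx
```

These curvature estimates are what PSD folds into the damping, which is precisely the information that matters on a poorly behaved solution space like the Even Asphere's.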

Another interesting comparison is perhaps the following, in which we apply a "stagnation trigger" to both optimizers in exactly the same way, triggering an update of the normalization radius of the Forbes surface. Here, we have:

[Image: merit-function convergence with the stagnation trigger applied to both optimizers]

which revealed another, better minimum of the merit function (MF).

This is a work in progress, and I do not think we need to over-complicate things by writing an interface to Modern Minpack, nor by modifying those packages to implement the PSD method on top of them. Maybe I am too pessimistic, but honestly I think that would be an unnecessary complication for this feature. We can simply add a new class in optimization.py, as I have done for testing the performance of this optimizer, and implement the method directly. I may be wrong, however, and fail to see the advantage of the more complicated path you described, so please enlighten me if that is the case.
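To make the "new class in optimization.py" idea concrete, here is a self-contained sketch of a plain damped least-squares loop. The class name, interface, and the simple up/down damping rule are all assumptions for illustration, not the custom DLS discussed in this thread:

```python
import numpy as np

class SimpleDLS:
    """Minimal damped least-squares (LM-style) optimizer sketch.

    Illustrative only: the interface and the crude multiplicative
    damping strategy are assumptions, not Optiland's optimization.py.
    """

    def __init__(self, residuals, jacobian, max_iter=200, tol=1e-12):
        self.residuals = residuals
        self.jacobian = jacobian
        self.max_iter = max_iter
        self.tol = tol

    def run(self, x0, mu=1e-3):
        x = np.asarray(x0, dtype=float)
        r = np.asarray(self.residuals(x))
        for _ in range(self.max_iter):
            J = np.asarray(self.jacobian(x))
            g = J.T @ r
            if np.linalg.norm(g, np.inf) < self.tol:
                break
            # damped normal equations: (J^T J + mu I) step = -g
            step = np.linalg.solve(J.T @ J + mu * np.eye(x.size), -g)
            r_new = np.asarray(self.residuals(x + step))
            if r_new @ r_new < r @ r:  # accept: reduce damping
                x, r = x + step, r_new
                mu /= 3.0
            else:                      # reject: increase damping
                mu *= 2.0
        return x
```

A PSD variant would slot in at the damping step, replacing the scalar mu with per-variable terms built from the pseudo-second-derivative estimates.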

NOTE: the results, as stated, are preliminary; the custom DLS we have implemented has yet to be further tested and does not represent the final implementation.

Best, Manuel

manuelFragata avatar Sep 01 '25 12:09 manuelFragata

Dear Manuel,

Wow, you are working on a custom DLS, and the results are very interesting. I am glad that it is most probably not necessary to touch the Fortran of the MINPACK codes, except perhaps to squeeze out the last bit of performance (the real duration of one cycle in seconds). I also favour your approach. I think Dilworth also detected stagnation and, in that case, triggered a randomizing algorithm (e.g. simulated annealing) to jump out of a local minimum when stuck in one, either manually or automatically, reoptimizing locally until the deepest or least sensitive minimum was found during global optimization. Most likely, as also written in the sources, very complex optimization problems with many operands, variables, and configurations will profit more from the PSD variants I-III, although your example is quite nice.

Best regards, Jonas

Jonas231 avatar Sep 01 '25 18:09 Jonas231

Hi there @Jonas231 ,

Thank you for your reply. I see your point, and perhaps, if everyone is on board with the idea, I could research strategies to improve the performance as much as possible, to get those desirable speed gains for the optimization (this could take more than two weeks, given that I am quite busy with some other things). I will experiment with different strategies and keep everyone updated here.

I also agree with you @Jonas231. Maybe you could suggest a list of optimization problems (i.e. systems with their variables and constraints) to test these optimizers on, so I can make a more comprehensive comparison? I could also come up with and look for these systems myself, but it would be nice to have your help here :) just let me know!

EDIT: @HarrisonKramer, what do you think about using a differentiable ray-tracing call within the optimizers to drastically reduce the cost of computing the Jacobian and the gradient of the merit function? I think it is the most logical and clear improvement I can see so far. I can run a quick check; it shouldn't be too difficult.
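As a conceptual illustration of what a differentiable trace buys: instead of n + 1 finite-difference evaluations, derivatives ride along with the values in a single pass. The toy forward-mode dual-number class below demonstrates the mechanism on a paraxial sag formula; it is a self-contained sketch of the concept, not Optiland's backend:

```python
class Dual:
    """Tiny forward-mode autodiff value: carries f and df/dx together.

    A minimal illustration of how a differentiable ray-trace call could
    return derivatives alongside values; conceptual sketch only.
    """
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (fg)' = f'g + fg'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def sag(c, r):
    # paraxial sag z = c * r^2 / 2; seeding c as Dual(c, 1.0) yields
    # dz/dc for free alongside the sag value itself
    return 0.5 * c * r * r
```

Seeding `Dual(0.01, 1.0)` for the curvature and tracing once gives both the sag and its derivative with respect to curvature, which is exactly the per-variable information a Jacobian column needs.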

Best, Manuel

manuelFragata avatar Sep 02 '25 09:09 manuelFragata

Hi @manuelFragata,

I can provide some examples of failed designs. Many of the systems that I design have multiple configurations, like f-theta lenses with galvo scanners of two or more axes, zoom beam expanders, ... so at the moment it is hard to model or optimize them in Optiland.

But I will also look in some books... Best, Jonas

Jonas231 avatar Sep 02 '25 18:09 Jonas231

Hi there @Jonas231 ,

Thank you, having those examples of failed designs would be good! I guess that for now we could simply model those more complex systems for just three configs: the central, and then the diagonal scanning-area image points. We can try. Just let me know!

Best, Manuel

manuelFragata avatar Sep 04 '25 10:09 manuelFragata