
SPSA and MGD (model gradient descent) methods

Open jlbosse opened this issue 4 years ago • 3 comments

Hey everyone, I recently implemented the simultaneous perturbation stochastic approximation (SPSA) algorithm and model gradient descent (MGD, first developed in https://arxiv.org/abs/2005.11011) in a roughly Optim.jl-compatible way. Is there interest in adding these methods to Optim.jl? If yes, I would make my code fully compatible with the Optim.jl interface and open a PR.

jlbosse avatar May 07 '21 07:05 jlbosse

Can you describe the models and optimizers briefly?

pkofod avatar Jun 09 '21 11:06 pkofod

Both are gradient-free methods for the optimization of noisy objective functions. At a high level, both work by estimating the gradient from noisy function evaluations and then doing gradient descent with that gradient estimate.

SPSA works by randomly choosing a perturbation vector Δx, evaluating y_+ = f(x + Δx) and y_- = f(x - Δx), and then estimating the gradient at x as grad = (y_+ - y_-) Δx / (2 |Δx|^2). SPSA is already implemented in BlackBoxOptim.jl, but the implementation there doesn't follow Optim.jl's interface.
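
For concreteness, here is a minimal sketch of one such SPSA step in plain Julia (not the actual implementation; the step sizes `a` and `c`, the ±1 Bernoulli perturbation, and the example objective are illustrative assumptions):

```julia
using LinearAlgebra

# One SPSA-style iteration: perturb, evaluate twice, estimate the gradient,
# and take a gradient-descent step. `a` and `c` are illustrative step sizes.
function spsa_step(f, x; a=0.1, c=0.1)
    Δx = c .* rand([-1.0, 1.0], length(x))    # random ±c perturbation
    y₊ = f(x .+ Δx)                           # noisy evaluation at x + Δx
    y₋ = f(x .- Δx)                           # noisy evaluation at x - Δx
    grad = (y₊ - y₋) / (2 * norm(Δx)^2) .* Δx # gradient estimate along Δx
    return x .- a .* grad                     # plain gradient-descent update
end

# Example: minimize a noisy quadratic starting from [1.0, -2.0]
f(x) = sum(abs2, x) + 0.01 * randn()
xmin = foldl((x, _) -> spsa_step(f, x), 1:200; init=[1.0, -2.0])
```

In practice the step sizes a and c are decayed over the iterations, which is what makes the method robust to evaluation noise.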

MGD estimates the gradient by randomly choosing points x_i in a ball around x and then obtaining (noisy) function evaluations y_i = f(x_i) at these points. A model function (typically a quadratic) is fitted to this data, together with all previous data in this ball, and the gradient is then estimated from this surrogate model.
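
A minimal sketch of one MGD-style step, assuming a least-squares quadratic fit; the sample count, ball radius `δ`, and learning rate `γ` are illustrative placeholders, and the reuse of previous data in the ball is omitted for brevity:

```julia
using LinearAlgebra

# One MGD-style iteration: sample a ball around x, fit a quadratic model to
# the noisy evaluations, and descend along the model's gradient at x.
function mgd_step(f, x; δ=0.5, γ=0.1, nsamples=20)
    n = length(x)
    # sample points uniformly from the ball of radius δ around x
    pts = [x .+ δ * rand()^(1 / n) .* normalize(randn(n)) for _ in 1:nsamples]
    ys = [f(p) for p in pts]                 # noisy evaluations y_i
    # least-squares fit of the model c + g'd + d'Ad in d = x_i - x
    features(d) = vcat(1.0, d, vec(d * d'))
    Φ = reduce(vcat, [features(p .- x)' for p in pts])
    θ = Φ \ ys
    g = θ[2:n+1]                             # model gradient at x
    return x .- γ .* g
end

# Example: one step on a noisy quadratic
f(x) = sum(abs2, x) + 0.01 * randn()
x = mgd_step(f, [1.0, -2.0])
```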

jlbosse avatar Jun 09 '21 14:06 jlbosse

> MGD estimates the gradient by randomly choosing points x_i in a ball around x and then obtaining (noisy) function evaluations y_i = f(x_i) at these points. A model function (typically a quadratic) is fitted to this data, together with all previous data in this ball, and the gradient is then estimated from this surrogate model.

https://github.com/SciML/Surrogates.jl https://github.com/MrUrq/SurrogateModelOptim.jl

ChrisRackauckas avatar Jun 09 '21 17:06 ChrisRackauckas