Angora
Look into alternatives to gradient descent
Section 3.4 of the paper describes using gradient descent with a 2-point method for computing the gradient vector. This basically involves doing O(d) function calls to find the approximate gradient, then doing a small number of calls to move in that direction. There is a long history of derivative-free optimization methods that try to make each function evaluation do some of both. For example, some methods keep an approximation of the function's shape and try calling the function at the minimum of that approximation; the new result is then used to build a better approximation.
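For concreteness, here is a rough Python sketch of that two-point finite-difference scheme. This is only an illustration of the general idea, not Angora's actual implementation in gd.rs; the objective `f`, the step size `eps`, and the learning rate `lr` are placeholders.

```python
import numpy as np

def two_point_gradient(f, x, eps=1e-6):
    """Approximate the gradient of f at x with forward differences.

    Costs d + 1 evaluations of f (one baseline plus one per coordinate),
    which is the O(d) cost mentioned above.
    """
    x = np.asarray(x, dtype=float)
    base = f(x)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - base) / eps
    return grad

def gradient_descent_step(f, x, lr=0.1):
    """One descent step: O(d) calls for the gradient, then one move."""
    return np.asarray(x, dtype=float) - lr * two_point_gradient(f, x)
```

The point of the derivative-free methods listed below is to avoid paying those d extra evaluations for every single step.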
I would start by looking into:
- Nelder–Mead: keep the d best function calls and use a best-fit plane through them as an approximate gradient.
- NEWUOA, which has a clever way of approximating a polynomial best fit to the previously evaluated points.
- BFGS is not what I would have guessed, as it uses the gradient, but it appears to be the default in scipy.optimize. Maybe look at the source code to figure out how it approximates the gradient (see the sketch after this list).
Great! You can look at the code in angora/fuzzer/src/search/gd.rs as an example. Don't hesitate to ask me if you have any problems.
That is going to be the topic of my BA thesis. I have a few more ideas on this, I'll post the results when I'm done :)
@greshake How is your thesis going? Did you find anything that worked? Did you try things that turned out not to pay off?