Knet.jl
LBFGS optimizer
I know most people use Adam or SGD to optimize the weights of a neural network. However, for some small NN architectures, an optimization method with line search (e.g., L-BFGS) can be far more efficient. I wonder if the Knet developers would be interested in implementing L-BFGS?
Optim.jl may already have this.
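For reference, a minimal L-BFGS call with Optim.jl looks roughly like this (a sketch assuming Optim.jl is installed; the Rosenbrock function is just an illustrative objective, not anything from Knet):

```julia
using Optim

# Classic Rosenbrock test function as a stand-in objective.
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

x0 = zeros(2)
# Without an explicit gradient, Optim.jl falls back to
# finite-difference gradients for LBFGS().
result = optimize(rosenbrock, x0, LBFGS())
```

`Optim.minimizer(result)` should then be close to `[1.0, 1.0]`.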
On Thu, Aug 6, 2020 at 9:33 AM Qiang Zhu [email protected] wrote:
I know most people use Adam or SGD to optimize the weights of a neural network. However, for some small NN architectures, an optimization method with line search (e.g., L-BFGS) can be far more efficient. I wonder if the Knet developers would be interested in implementing L-BFGS?
Yes, I know. Can I use it with Knet? I tried, but it is not obvious how. To be more specific, Optim.jl needs to know the following to optimize a function f(x):
- x: the parameters
- f: the objective function
- g: the gradient of f with respect to x
Both x and g need to be converted to a flat 1-D array. The main trouble is g: my student and I tried to transform g into something Optim.jl accepts, but we failed. Not sure if someone has such experience.
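One possible way to bridge the two is sketched below, assuming a model with a single `Param` array so that flattening is trivial; the objective `loss` and the parameter size are made up for illustration, and a multi-array model would need the flat vector split across all its `Param`s:

```julia
using Knet, AutoGrad, Optim

# Hypothetical model with one parameter array for simplicity.
w = Param(randn(10))
loss(w) = sum(abs2, w .- 1)   # illustrative objective, minimum at w .== 1

# Optim.jl's only_fg! interface wants one function that fills in the
# objective value F and/or the gradient G at a flat vector x.
function fg!(F, G, x)
    copyto!(value(w), x)             # load the flat vector into the Param
    J = @diff loss(w)
    if G !== nothing
        copyto!(G, vec(grad(J, w)))  # flatten the Knet gradient for Optim
    end
    F === nothing ? nothing : value(J)
end

result = optimize(Optim.only_fg!(fg!), copy(vec(value(w))), LBFGS())
```

The `copy` on the starting point avoids aliasing the Param's own storage.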
```julia
x = Param(Array(...))
J = @diff f(x)
g = grad(J, x)
```

Should give you a `g` with the exact type/shape as `x`. See `@doc AutoGrad`.
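To make that concrete, here is a tiny self-contained check that the gradient comes back with the same shape as `x` (a sketch assuming AutoGrad.jl is installed; the objective is an arbitrary example):

```julia
using AutoGrad

x = Param(rand(3, 2))          # 3×2 parameter array
f(x) = sum(abs2, x)            # simple scalar objective
J = @diff f(x)
g = grad(J, x)
@assert size(g) == size(x)     # gradient has the same shape as x
@assert g ≈ 2 .* value(x)      # d/dx sum(x.^2) = 2x
```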