MariusDrulea

Results: 75 comments by MariusDrulea

Edit after posting the issue: the behavior might be the expected one. I think it has to be this way, since we want to use only explicit loss functions.

@ToucheSir I just noticed the correct way to call the implicit form is like this: `∇m = gradient(()->loss_fun(), ps)`. We have to provide a function with no arguments and **also the...
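For context, a minimal sketch of the implicit-parameters call pattern referred to above, assuming an older Flux/Zygote setup where `Flux.params` is still supported; the model, data and `loss_fun` below are illustrative and not from the original issue:

```julia
using Flux

# hypothetical toy model and data, only to illustrate the call pattern
m = Dense(3 => 2)
x, y = rand(Float32, 3, 8), rand(Float32, 2, 8)

loss_fun() = Flux.Losses.mse(m(x), y)   # zero-argument closure over the model

ps = Flux.params(m)                     # implicit parameter collection
∇m = gradient(() -> loss_fun(), ps)     # gradients keyed by the parameter arrays
[∇m[p] for p in ps]                     # access the gradient of each parameter
```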

Something I just remarked:

```
f(x) calls: 53
∇f(x) calls: 53
=> there are 2 parameters; for each parameter we need 2 evaluations of f for the gradient => 2 params...
```
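To make the counting concrete, here is a small sketch of how these counters can be inspected in Optim.jl; the objective is illustrative and the exact counts depend on the Optim version and method:

```julia
using Optim

# two-parameter objective; no gradient is supplied, so Optim builds one by finite differences
f(x) = (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2

res = optimize(f, [0.0, 0.0], BFGS())

Optim.f_calls(res)   # objective evaluations counted by Optim
Optim.g_calls(res)   # gradient evaluations; each one needs extra f evaluations internally
                     # when the gradient is built by (central) finite differences
```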

> but is it better for the interpretation of f_calls_limit to vary depending on whether Optim creates gradients rather than takes them in as blackbox inputs?

I think so, why...

I think a better description of what it counts (objective calls) and what it does not count (objective evaluations made while computing gradients) would be sufficient: https://julianlsolvers.github.io/Optim.jl/stable/#user/config/#general-options.
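For reference, a sketch of where that option is set; the limit, objective and solver below are placeholders, not taken from the issue:

```julia
using Optim

f(x) = sum(abs2, x)

# f_calls_limit bounds the counted objective calls; per the discussion above it does not
# include objective evaluations performed internally to build finite-difference gradients
opts = Optim.Options(f_calls_limit = 100, iterations = 1_000)
res = optimize(f, [1.0, 2.0], BFGS(), opts)
```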

I have added a better description in this PR: https://github.com/JuliaNLSolvers/Optim.jl/pull/1054. The PR also shows the minimizer at the end of the minimization. Let me know if this is also appropriate.

Another related example, using `sum` instead of `prod`. Pseudocode + explanations:

```
f = x1 + x2
grad_f = (1, 1)
fg = sum(grad_f) = 2
grad_fg = [0, 0]...
```
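As an independent sanity check of these expected values (using ForwardDiff rather than the package under discussion), the second derivatives of `sum` are indeed all zero; `prod` is shown alongside for comparison:

```julia
using ForwardDiff

ForwardDiff.hessian(sum, [1.0, 2.0])        # 2×2 zero matrix, matching grad_fg = [0, 0]
ForwardDiff.hessian(prod, [1.0, 2.0, 3.0])  # [0 3 2; 3 0 1; 2 1 0]
```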

As a side note, the hessian of the `prod` function works in AutoGrad.jl:

```julia
using AutoGrad
x = Param([1,2,3]) # user declares parameters
p(x) = prod(x)
hess(f, i=1) = grad((x...)->grad(f)(x...)[i])
hess(p,...
```

Some really good news: if I use 2 separate params, the hessian works. Next is to adjust this to work for arrays.

```julia
using Tracker
x1 = param(1)
x2 = ...
```
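A rough sketch of what such a two-scalar setup might look like; this is my guess at the pattern (mirroring the single-argument example in the next comment), not the original code, and the objective and expected values are only illustrative:

```julia
using Tracker

f(x1, x2) = x1 * x2                                      # two scalar parameters
# partial derivative w.r.t. the first argument, kept differentiable via nest=true
df1(x1, x2) = Tracker.gradient(f, x1, x2, nest=true)[1]
# differentiate that partial again to obtain one row of the hessian
d2f(x1, x2) = Tracker.gradient((a, b) -> df1(a, b), x1, x2, nest=true)

d2f(1.0, 2.0)   # expected ≈ (0, 1), i.e. (∂²f/∂x1², ∂²f/∂x1∂x2)
```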

Higher order derivatives of functions with a single argument work correctly:

```julia
using Tracker
f(x) = sin(cos(x^2))
df(x) = gradient(f, x, nest=true)[1]
d2f(x) = gradient(u->df(u), x, nest=true)[1]
d3f(x) = gradient(u->d2f(u),...
```
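To cross-check these nested derivatives against an independent implementation, a quick sketch with ForwardDiff (my addition, not part of the original comment):

```julia
using ForwardDiff

f(x) = sin(cos(x^2))
fd_df(x)  = ForwardDiff.derivative(f, x)
fd_d2f(x) = ForwardDiff.derivative(fd_df, x)

fd_df(1.2), fd_d2f(1.2)   # compare against df(1.2) and d2f(1.2) from the Tracker version
```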