MariusDrulea

Results: 75 comments by MariusDrulea

Edit after posting the issue: the behavior might be the expected one. I think it has to be this way, since we want to use only explicit loss functions.

@ToucheSir I just noticed the correct way to call the implicit form is like this: `∇m = gradient(()->loss_fun(), ps)`. We have to provide a function with no arguments and **also the...
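For context, a minimal sketch of the implicit-parameters call pattern referred to above, assuming an older Flux/Zygote setup where `Flux.params` is still supported; the model, data and `loss_fun` below are illustrative and not from the original issue:

```julia
using Flux

# hypothetical toy model and data, only to illustrate the call pattern
m = Dense(3 => 2)
x, y = rand(Float32, 3, 8), rand(Float32, 2, 8)

loss_fun() = Flux.Losses.mse(m(x), y)   # zero-argument closure over the model

ps = Flux.params(m)                     # implicit parameter collection
∇m = gradient(() -> loss_fun(), ps)     # gradients keyed by the parameter arrays
[∇m[p] for p in ps]                     # access the gradient of each parameter
```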

Something I just remarked:

```
f(x) calls: 53
∇f(x) calls: 53
=> there are 2 parameters; for each parameter we need 2 evaluations of f for the gradient => 2 params...
```
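To make the counting concrete, here is a small sketch of how these counters can be inspected in Optim.jl; the objective is illustrative and the exact counts depend on the Optim version and method:

```julia
using Optim

# two-parameter objective; no gradient is supplied, so Optim builds one by finite differences
f(x) = (x[1] - 1)^2 + 100 * (x[2] - x[1]^2)^2

res = optimize(f, [0.0, 0.0], BFGS())

Optim.f_calls(res)   # objective evaluations counted by Optim
Optim.g_calls(res)   # gradient evaluations; each one needs extra f evaluations internally
                     # when the gradient is built by (central) finite differences
```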

> but is it better for the interpretation of f_calls_limit to vary depending on whether Optim creates gradients rather than takes them in as blackbox inputs?

I think so, why...

I think a better description of what it counts (objective calls) and what it does not count (objective evaluations made while computing gradients) would be sufficient: https://julianlsolvers.github.io/Optim.jl/stable/#user/config/#general-options.
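For reference, a sketch of where that option is set; the limit, objective and solver below are placeholders, not taken from the issue:

```julia
using Optim

f(x) = sum(abs2, x)

# f_calls_limit bounds the counted objective calls; per the discussion above it does not
# include objective evaluations performed internally to build finite-difference gradients
opts = Optim.Options(f_calls_limit = 100, iterations = 1_000)
res = optimize(f, [1.0, 2.0], BFGS(), opts)
```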

I have added a better description in this PR: https://github.com/JuliaNLSolvers/Optim.jl/pull/1054. The PR also shows the minimizer at the end of the minimization. Let me know if this is also appropriate.

Another related example, using `sum` instead of `prod`. Pseudocode + explanations:

```
f = x1 + x2
grad_f = (1, 1)
fg = sum(grad_f) = 2
grad_fg = [0, 0]...
```
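As an independent sanity check of these expected values (using ForwardDiff rather than the package under discussion), the second derivatives of `sum` are indeed all zero; `prod` is shown alongside for comparison:

```julia
using ForwardDiff

ForwardDiff.hessian(sum, [1.0, 2.0])        # 2×2 zero matrix, matching grad_fg = [0, 0]
ForwardDiff.hessian(prod, [1.0, 2.0, 3.0])  # [0 3 2; 3 0 1; 2 1 0]
```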

As a side note, the hessian of the `prod` function works in AutoGrad.jl:

```julia
using AutoGrad
x = Param([1,2,3]) # user declares parameters
p(x) = prod(x)
hess(f, i=1) = grad((x...)->grad(f)(x...)[i])
hess(p,...
```

Some really good news: if I use 2 separate params, the hessian works. Next is to adjust this to work for arrays.

```julia
using Tracker
x1 = param(1)
x2 = ...
```
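A rough sketch of what such a two-scalar setup might look like; this is my guess at the pattern (mirroring the single-argument example in the next comment), not the original code, and the objective and expected values are only illustrative:

```julia
using Tracker

f(x1, x2) = x1 * x2                                      # two scalar parameters
# partial derivative w.r.t. the first argument, kept differentiable via nest=true
df1(x1, x2) = Tracker.gradient(f, x1, x2, nest=true)[1]
# differentiate that partial again to obtain one row of the hessian
d2f(x1, x2) = Tracker.gradient((a, b) -> df1(a, b), x1, x2, nest=true)

d2f(1.0, 2.0)   # expected ≈ (0, 1), i.e. (∂²f/∂x1², ∂²f/∂x1∂x2)
```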

Higher order derivatives of functions with a single argument work correctly:

```julia
using Tracker
f(x) = sin(cos(x^2))
df(x) = gradient(f, x, nest=true)[1]
d2f(x) = gradient(u->df(u), x, nest=true)[1]
d3f(x) = gradient(u->d2f(u),...
```
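To cross-check these nested derivatives against an independent implementation, a quick sketch with ForwardDiff (my addition, not part of the original comment):

```julia
using ForwardDiff

f(x) = sin(cos(x^2))
fd_df(x)  = ForwardDiff.derivative(f, x)
fd_d2f(x) = ForwardDiff.derivative(fd_df, x)

fd_df(1.2), fd_d2f(1.2)   # compare against df(1.2) and d2f(1.2) from the Tracker version
```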