JuLIP.jl icon indicating copy to clipboard operation
JuLIP.jl copied to clipboard

Augmented Lagrangian

Open jameskermode opened this issue 7 years ago • 12 comments

This is just a reminder to go back and try JuLIP with the augmented Lagrangian implementation of constrained optimisation from https://github.com/cortner/ConstrainedOptim.jl

Related to issue #33

jameskermode avatar Feb 14 '17 17:02 jameskermode

Good point - do you have a test system?

cortner avatar Feb 14 '17 18:02 cortner

A constrained bond length would be the first canonical example. I will set something up and let you know if/when there are problems.

jameskermode avatar Feb 15 '17 09:02 jameskermode

there is now a new ConstrainedOptim which we can try at some point. But I'm assuming the basic Newton scheme we have in Experimental works for now?

cortner avatar Mar 12 '18 23:03 cortner

@lifelemons is still having robustness problems with the experimental constrained Newton scheme, even with the gradient-based linesearch, so I think we will give ConstrainedOptim a go.

IL027 avatar Mar 15 '18 18:03 IL027

(Oops, this is @jameskermode, signed in with wrong GitHub account)

IL027 avatar Mar 15 '18 18:03 IL027

This is the crack problem, yes? Maybe @lifelemons could remind me how I can access it and I’ll have another look?

cortner avatar Mar 15 '18 19:03 cortner

Here is an attempt at using ConstrainedOptim.jl to fix the length of a bond in a piece of bulk SW silicon with latest JuLIP. The optimiser converges and penalty goes to zero, but the constraint is not satisfied afterwards within the target threshold of +/- 1e-2, so I must be doing something wrong. I've tried increasing strength of initial penalty.

@cortner would you be willing to take a look?

Final output I get is

b = Any[2.35126, 2.39, 2.35126, 2.35126]
E = Any[-34.68, -34.6702, -34.68, -34.68]
using ConstrainedOptim
using JuLIP

a = bulk(:Si, cubic=true)
set_constraint!(a, FixedCell(a))
set_calculator!(a, StillingerWeber())

i = 1
j = 2
I1 = atomdofs(a, i)
I2 = atomdofs(a, j)
x0 = dofs(a)

@show I1
@show I2

I1I2 = [I1; I2]

blen(x) = norm(x[I2] - x[I1])

# energy, gradient and hessian for the interatomic potential
ener(x) = energy(a, x)
gradient!(g, x) = (g[:] = gradient(a, x); g)
hessian!(h, x) = (h[:,:] = hessian(a, x); h)

b = []
E = []

for target_bondlength = 2.3:0.1:2.6

    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - target_bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
        J
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on relevent dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - target_bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    set_dofs!(a, x0)
    X = positions(a)
    u = a[j] - a[i]
    f = 1.0 - target_bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    @show target_bondlength, blen(x)
    
    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true, 
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    @show res
    @show _c(res.minimizer[I1I2])
    @show blen(res.minimizer)
    
    push!(b, blen(res.minimizer))
    push!(E, ener(res.minimizer))
end

@show b
@show E

jameskermode avatar Apr 05 '18 14:04 jameskermode

Think I've got it working, mistake was not zeroing all elements of constraint Jacobian:

function con_jacobian!(J, x)
    J[1,:] = 0.0
    J[1,I1] = (x[I1]-x[I2])/blen(x)
    J[1,I2] = (x[I2]-x[I1])/blen(x)
end

jameskermode avatar Apr 05 '18 15:04 jameskermode

What about the hessian, would one need to zero that out too? Does that now show something more reasonable?

lifelemons avatar Apr 05 '18 15:04 lifelemons

let me know if I still need to look at it?

if this works, then it might be time to discuss how to best implement a more generic set of constraints that can be conveniently accessed? I.e. #20

cortner avatar Apr 05 '18 15:04 cortner

No, slightly confusingly the constraint Hessian contribution is additive, so that goes on top of the one from the objective, while the constraint Jacobian needs the complete vector.

jameskermode avatar Apr 05 '18 15:04 jameskermode

Thanks Christoph- working fine now, so no need to look at, but agree it would be good to implement a generalisation of the function into JuLIP in a more convenient way.

function minimize_constrained_bond!(a, i, j, bondlength)
    I1 = atomdofs(a, i)
    I2 = atomdofs(a, j)
    I1I2 = [I1; I2]

    # bondlength between atoms i and j
    blen(x) = norm(x[I2] - x[I1])

    # energy, gradient and hessian for the interatomic potential
    ener(x) = energy(a, x)
    gradient!(g, x) = (g[:] = gradient(a, x); g)
    hessian!(h, x) = (h[:,:] = hessian(a, x); h)
    
    # constraint function
    con_c!(c, x) = (c[1] = blen(x) - bondlength; c)

    # Jacobian of constraint, shape [1, length(x)]
    function con_jacobian!(J, x)
        J[1,:] = 0.0
        J[1,I1] = (x[I1]-x[I2])/blen(x)
        J[1,I2] = (x[I2]-x[I1])/blen(x)
    end

    # Hessian contribution of constraint
    # We define auxiliary index arrays _I1 and _I2 that start at 1 to
    # allow the ForwardDiff hessian to be done only on relevent dofs, x[I1I2]
    _I1 = 1:length(I1)
    _I2 = length(I1)+1:length(I1)+length(I2)
    _blen(x) = norm(x[_I2] - x[_I1])
    _c(x) = _blen(x) - bondlength

    function con_h!(h, x, λ)
        h[I1I2, I1I2] += λ[1] * ForwardDiff.hessian(_c, x[I1I2])
    end

    df = TwiceDifferentiable(ener, gradient!, hessian!, x0)
    lx = Float64[]; ux = Float64[]
    lc = [-1e-2]; uc = [1e-2]
    dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                         lx, ux, lc, uc)

    # move atoms i and j so constraint is initially satisfied
    X = positions(a)
    u = X[j] - X[i]
    f = 1.0 - bondlength / norm(u)
    X[i] += 0.5 * f * u
    X[j] -= 0.5 * f * u
    set_positions!(a, X)
    x = dofs(a)
    
    res = optimize(df, dfc, x, IPNewton(show_linesearch=false, μ0=:auto), 
                    Optim.Options(show_trace = true,
                                  extended_trace = false,
                                  allow_f_increases = true, 
                                  successive_f_tol = 2))
    set_dofs!(a, res.minimizer)
end

Meanwhile, @lifelemons please give this a go for your use case and let me know of any problems

jameskermode avatar Apr 05 '18 15:04 jameskermode