
Training loop => ERROR: DimensionMismatch && Multiple Shooting => Domain Error

VoSiLk opened this issue 4 years ago · 8 comments

https://discourse.julialang.org/t/what-does-the-maxiters-solver-option-do/37376/13

Environment details: OS: Windows 10 x64 Julia Version : 1.6.0

Packages:

  • DiffEqFlux v1.39.0
  • DiffEqSensitivity v6.52.0
  • DifferentialEquations v6.17.1
  • Flux v0.12.4
  • GalacticOptim v1.3.0
  • Optim v1.3.0
  • Zygote v0.6.14

Hey, I have solved the problems I had on the 29th of June (I used the whole tspan = 0.1 s to 200 s, and for the loss I only considered specific indices of it, e.g. 0.1 s to 1.6 s).

function loss(p)
    sol = predict_neuralode_rk4(p)
    N = sum(idx)
    return sum(abs2.(y[idx] .- sol')) / N
end

Now I change the tspan iteratively (see the code below). In this case, stiffness and instability were not a problem.

However, now I get an error message from the loss function:

ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 122 and 102").

I have already checked the dimensions of the prediction and the data (both 122). I don't know the reason for the error, or why it only occurs at iteration 13.
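For context, broadcasting two arrays of different lengths reproduces exactly this error in isolation; a minimal sketch, with the sizes taken from the error message:

```julia
a = zeros(122)   # e.g. the target slice
b = zeros(102)   # e.g. a prediction with fewer saved points

# a .- b throws DimensionMismatch, the same error class as in the training loop:
err = try
    a .- b
    nothing
catch e
    e
end
err isa DimensionMismatch   # true
```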

using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots, GalacticOptim

plotly()

ex = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.24, 0.24, 0.24, 0.24, 0.24, 0.62, 0.62, 0.62, 0.62, 0.62, 0.62, 0.0, 0.0, 0.0, 0.0, 0.0, 0.38, 0.38, 0.38, 0.38, 0.38, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.78, 0.44, 0.44, 0.44, 0.44, 0.44, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.56, 0.56, 0.56, 0.56, 0.56, 0.42, 0.42, 0.42, 0.42, 0.42, 0.3, 0.3, 0.3, 0.3, 0.3, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88, 0.88]


idxs = [16, 21, 27, 32, 37, 52, 57, 72, 77, 82, 87, 102, 122]

tspan = (0.1f0, Float32(length(ex)*0.1))

## process definition
f(x) = (atan(8.0 * x - 4.0) + atan(4.0)) / (2.0 * atan(4.0))

function hammerstein_system(u)
    y = zeros(size(u))
    for k in 2:length(u)
        y[k] = 0.2 * f(u[k-1]) + 0.8 * y[k-1]
    end
    return y
end

## simulation
y = Float32.(hammerstein_system(ex))

## model design
nn_model = FastChain(FastDense(2,8, tanh), FastDense(8, 1))
p_model = initial_params(nn_model)

function dudt(u, p, t)
    nn_model(vcat(u[1], ex[Int(round(10.0*t))]), p)
end

function predict_neuralode_rk4(p)
    _prob = remake(prob,p=p)
    Array(solve(_prob, RK4(), dt = 0.01f0, saveat=[0.1f0:0.1f0:Float32(tspan[2]);]))
end

## loss function definition
loss(p) = sum(abs2.(y_sub .- vec(predict_neuralode_rk4(p))))

function train(t0, t1, dudt, p, loss)
    global tspan = (t0, t1)
    u0 = [Float32.(y[1])]

    global y_sub = y[1:Int(round(t1*10.0f0))]
    global prob = ODEProblem(dudt,u0,tspan,nothing)

    adtype = GalacticOptim.AutoZygote()
    optf = GalacticOptim.OptimizationFunction((x, p) -> loss(x), adtype)
    optfunc = GalacticOptim.instantiate_function(optf, p, adtype, nothing)
    optprob = GalacticOptim.OptimizationProblem(optfunc, p)

    res_tmp =  GalacticOptim.solve(optprob, ADAM(0.05), maxiters=50)

    optprob = remake(optprob,u0 = res_tmp.u)

    return GalacticOptim.solve(optprob, LBFGS(), allow_f_increases = false)
end

p_tmp = deepcopy(p_model)
sim_idx = 13
for i in 1:sim_idx
    println("Iteration $i - 0.1s - $(idxs[i]*0.1) s")
    res_tmp = train(0.1f0, Float32(idxs[i]*0.1), dudt, p_tmp, loss)
    p_tmp = deepcopy(res_tmp.u)
end
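One thing worth checking here (a hedged observation, not confirmed in the thread): the number of saved points is determined by the global tspan that predict_neuralode_rk4 reads through saveat at call time, so if that global ever lags behind the y_sub built inside train, the two lengths diverge. With 0.1 s sampling, the end times for idxs[12] and idxs[13] give exactly the two lengths in the error message:

```julia
# Number of save points implied by two different end times at 0.1 s spacing:
n12 = length(0.1f0:0.1f0:10.2f0)   # end time idxs[12]*0.1 = 10.2 s -> 102 points
n13 = length(0.1f0:0.1f0:12.2f0)   # end time idxs[13]*0.1 = 12.2 s -> 122 points
(n12, n13)                          # (102, 122): the lengths in the DimensionMismatch
```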

Furthermore, I tried to train with multiple shooting, but for this I'm getting a domain error, although my specified group size is within the limits.

group_size = 12
datasize = 102
continuity_term = 200
tspan = (0.1f0, Float32(102*0.1))
tsteps = range(tspan[1], tspan[2], length = idxs[12])
function loss_function(data, pred)
	return sum(abs2, data - pred)
end

function loss_multiple_shooting(p)
    return multiple_shoot(p_model, y[1:idxs[12]], tsteps, prob, loss_function, RK4(),
                          group_size; continuity_term)
end

res_ms = DiffEqFlux.sciml_train(loss_multiple_shooting, p_model)

ERROR: DomainError with 12: group_size can't be < 2 or > number of data points

Note that the code of multiple shooting depends on the above code.

VoSiLk avatar Jul 05 '21 09:07 VoSiLk

@adrhill I assume ode_data is expected to be an array of arrays?

ChrisRackauckas avatar Jul 05 '21 11:07 ChrisRackauckas

@ChrisRackauckas It should be an array of shape nstates x ntimesteps. Basically the format you'd get from calling Array(solve(...)).

@VoSiLk the DomainError in your multiple shooting code occurred because

group_size > size(ode_data, 2)

which means that the specified group size is longer than your dataset (in time).
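The check implied by the error message can be sketched as follows (reconstructed for illustration; not the actual multiple_shoot source):

```julia
# Hypothetical reconstruction of the bounds check behind the DomainError:
function check_group_size(group_size, ode_data)
    datasize = size(ode_data, 2)   # ntimesteps; note this is 1 for a plain vector
    if group_size < 2 || group_size > datasize
        throw(DomainError(group_size, "group_size can't be < 2 or > number of data points"))
    end
end

check_group_size(12, zeros(1, 102))   # passes: 2 <= 12 <= 102
# check_group_size(12, zeros(102))    # throws: a vector has size(data, 2) == 1
```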

adrhill avatar Jul 05 '21 15:07 adrhill

@adrhill I also tried different group sizes, e.g. 3, but the error message is the same. The data I pass to the optimizer are sampled from 0.1 s to 10.2 s.

VoSiLk avatar Jul 06 '21 11:07 VoSiLk

It's because the data shape is a vector.

ChrisRackauckas avatar Jul 06 '21 11:07 ChrisRackauckas

It optimizes without an error now; just the result isn't good. Thanks.

VoSiLk avatar Jul 06 '21 12:07 VoSiLk

Do you know what the reason for the training loop's ERROR: DimensionMismatch is?

VoSiLk avatar Jul 06 '21 12:07 VoSiLk

nstates x ntimesteps

Since your data was a vector (IIRC, I only quickly ran the code), your size was ntimesteps x 1, in which case a group_size of 2 is larger than the dataset, because it thinks there's only one time step. If you want just a scalar as the return, you need to make the data a 1 x n matrix instead.
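In other words (a sketch under the stated assumption that multiple_shoot reads the data as nstates x ntimesteps), reshaping the vector to a 1 x n matrix restores the expected layout:

```julia
y_vec = rand(Float32, 102)        # current shape: (102,), so size(y_vec, 2) == 1
ode_data = reshape(y_vec, 1, :)   # shape (1, 102): nstates = 1, ntimesteps = 102
size(ode_data, 2)                 # 102, so any group_size in 2:102 passes the check
```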

ChrisRackauckas avatar Jul 06 '21 14:07 ChrisRackauckas

I adapted the train loop

function train(t0, t1, dudt, p, loss)
    tspan = (t0, t1)
    u0 = [Float32.(y[1])]
    global y_sub = Float32.(y[1:Int(round(t1*10.0f0))]')
    global prob = ODEProblem(dudt,u0,tspan,nothing)

    adtype = GalacticOptim.AutoZygote()
    optf = GalacticOptim.OptimizationFunction((x, p) -> loss(x), adtype)
    optfunc = GalacticOptim.instantiate_function(optf, p, adtype, nothing)
    optprob = GalacticOptim.OptimizationProblem(optfunc, p)

    res_tmp =  GalacticOptim.solve(optprob, ADAM(0.05), maxiters=50)

    optprob = remake(optprob,u0 = res_tmp.u)

    return GalacticOptim.solve(optprob, LBFGS(), allow_f_increases = false)
end

and the loss

loss(p) = sum(abs2.(y_sub - predict_neuralode_rk4(p)))

but the error still occurs:

ERROR: DimensionMismatch("dimensions must match: a has dims (Base.OneTo(1), Base.OneTo(122)), b has dims (Base.OneTo(1), Base.OneTo(102)), mismatch at 2") Stacktrace:

I also tried to evaluate it outside the loop (screenshot not reproduced).

I can even evaluate the loss.
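A hedged observation on the remaining error: in the adapted train, tspan is now a local variable, while predict_neuralode_rk4 still reads the global tspan, whose last global assignment in the posted code is (0.1f0, Float32(102*0.1)) from the multiple-shooting section. That combination would produce exactly a 1 x 102 prediction against a 1 x 122 target. A sketch of the arithmetic, with hypothetical values mirroring the posted code:

```julia
tspan_global = (0.1f0, 10.2f0)                     # stale global that saveat would use
saveat = [0.1f0:0.1f0:Float32(tspan_global[2]);]   # points the solver actually saves
t1 = 12.2f0                                        # local end time used to build y_sub
n_target = Int(round(t1 * 10.0f0))                 # number of target samples
(length(saveat), n_target)                         # (102, 122): the reported mismatch
```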

VoSiLk avatar Jul 06 '21 15:07 VoSiLk