StochasticDelayDiffEq.jl
Solution performance for network of delays
I am trying to move towards an implementation of the Kuramoto oscillator network model with time delays and stochastic forcing. My first attempt here is just a linear SDDE:
```julia
using StochasticDelayDiffEq

# Drift: linear decay plus delayed all-to-all coupling
function f(du, u, h, d, t)
    aff = zero(u)
    for j in 1:length(u)
        for i in 1:length(u)
            aff[i] += h(d, t - d[i, j]; idxs = j)
        end
    end
    du .= -u .+ 0.1 .* aff
end

# Additive noise
function g(du, u, h, d, t)
    du .= 0.1
end

n = 100
h(d, t; idxs = nothing) = idxs === nothing ? (ones(n) .+ t) : (ones(n)[idxs] .+ t)
tspan = (0.0, 10.0)
ic = randn(n)
d = rand(n, n)

prob = SDDEProblem(f, g, ic, h, tspan, d; constant_lags = d)
sol = solve(prob, EM(), dt = 0.01, progress = true, progress_steps = 10)
```
Trying this on Julia v1.5 runs for several minutes without producing any output. With n=10, the solution is done in a few seconds. This suggests (to me) super-linear scaling of some delay-related data structures internally, and I am curious whether I'm using this the wrong way?
I'm not too surprised that it is quite a bit slower since the number of delays scales quadratically with n. Since you specify constant_lags (BTW I've never specified them as a matrix, but great if it works :smile:), the solver tracks the propagated discontinuities explicitly and reduces the time steps so that it hits them (more or less) exactly, up to a certain discontinuity order that depends on the chosen solver. You could try not specifying constant_lags - in that case the solver will not track the propagated discontinuities and will not try to hit them exactly, but will only decrease the step size if the error heuristic becomes too large (at least in DelayDiffEq, not completely sure ad hoc about the implementation in StochasticDelayDiffEq). This might speed up the computation since probably fewer steps are required, at the expense of an increased error.
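For reference, dropping constant_lags is just a one-argument change to the problem construction in the original post (a sketch, reusing f, g, ic, h, tspan, and d as defined above):

```julia
# Same problem, but without constant_lags: the solver no longer tracks
# propagated discontinuities explicitly and relies on error-based
# adaptivity alone to handle them.
prob_notrack = SDDEProblem(f, g, ic, h, tspan, d)
sol_notrack = solve(prob_notrack, EM(), dt = 0.01)
```

Note that the delay matrix d is still passed as the parameter object, so the drift can look up each lag; only the discontinuity-tracking keyword is removed.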
Thanks for the helpful details! I don't need the discontinuity propagation for now, I was just following the example in the unit tests. I'll drop that and see what happens.
That reminds me though, @YingboMa was mentioning that in some of Enright's or Hairer's work, discontinuities were not checked for if steps were accepted, which is something we could look into adding.
> in this case the solver will not track the propagated discontinuities and not try to hit them exactly but will only decrease the step size if the error heuristic becomes too large (at least in DelayDiffEq, not completely sure ad hoc about the implementation in StochasticDelayDiffEq)
The error-checking adaptivity will do something similar here. But note that SDE integrators are of such low order that discontinuities shouldn't have to propagate far, so we can definitely do better in the detection process.
> That reminds me though, @YingboMa was mentioning that in some of Enright's or Hairer's work, discontinuities were not checked for if steps were accepted, which is something we could look into adding.
Dependent delays are already handled in this way (similar to RADAR5): https://github.com/SciML/DelayDiffEq.jl/pull/121
Thanks for the tips & info. For n=10, dropping constant_lags is 10x faster; for n=20 it's 100x. For my problem size of n=164 this makes it workable, so I'm going to close this issue (since this was a feature, not a bug, in any case).
Yeah, 164... oof. Even with a 1st-order method that's placing 26896 discontinuities. I see that this is the issue: with dependent delays we don't do the extra computation if we accept a step, but with constant lags we step exactly to the discontinuity, so it's "normally" not a huge issue. In this case, though, you are forced to hit 26896 points in the integration, many of which could probably be stepped over while still hitting tolerance, so we probably could ignore some. This is something to think about.
Though n=10 shouldn't be an issue: if the normal solve was like 100 steps, you shouldn't be able to see the 10 discontinuity handling points, which makes me think something is going wrong. It might be propagating to higher order when 1st order is all that's needed?
In any case the above example exaggerates the problem, since rand(n,n) generates that many unique values. In the domain for this set of equations, the delays are bounded on an interval of 0 to 256 ms, and an error of a few ms in a delay is not problematic. We therefore in practice round delays to the nearest integer number of time steps (using EM or Heun) so that the solver hits every delay exactly without the need to interpolate.
If I try this in the above code by constraining delays to integer multiples of the step size,

```julia
d = rand(n, n)
dt = 0.01
d = floor.(d .^ 4 ./ dt) .* dt
```
this results in 100 unique delay values (instead of 164^2), each on the solver time grid (since this is EM with dt=0.01). This is about 15% faster (comparing the last of 4 runs to avoid JIT overhead). Setting save_everystep=false reduces that by another 45%. Switching methods to SRIW1 halves it again. This is fast enough for my purposes.
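The claim about unique delay values can be checked directly with the standard library (a sketch using the n and dt from this thread; the variable names nunique and all_on_grid are illustrative, not from the original post):

```julia
n, dt = 164, 0.01
d = rand(n, n)
d = floor.(d .^ 4 ./ dt) .* dt      # snap each delay onto the time grid

# d^4 lies in [0, 1), so floor(d^4/dt) takes values 0..99:
# at most 100 distinct delays instead of n^2 = 26896.
nunique = length(unique(d))

# Every delay should be (numerically) an integer multiple of dt.
all_on_grid = all(x -> isapprox(round(x / dt) * dt, x; atol = 1e-12), d)
```

With far fewer distinct lags, the discontinuity tree the solver would otherwise track stays small, and with EM at dt=0.01 every lag lands exactly on a step, so no history interpolation is needed.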
Great! It would be good to establish some benchmark problems for this repo, since it's new enough that we don't have a set to optimize against yet. So if that ever becomes a public model, it would be nice if you could donate it 😄 .
I'd be happy to contribute the model, though the analytic result I have (not for the solution itself but for the critical parameter value of the phase transition) is only for the non-delayed case. I'm checking to see if that can be relaxed. I am also trying to make work-precision diagrams for the different solvers with respect to this analytic result, which I could share as a notebook or something.
This may not be the format you had in mind, but this is what's in my editor right now, and I figured I'd copy-paste before I forget about this issue:
```julia
using StochasticDelayDiffEq, Statistics, DelimitedFiles, ZipFile

# Download weights & tract lengths for the TVB 76-region connectome
function get_tvb_connectome()
    fname = "conn76.zip"
    if !isfile(fname)
        url = "https://github.com/the-virtual-brain/tvb-data/raw/master/tvb_data/connectivity/connectivity_76.zip"
        download(url, fname)
    end
    zf = ZipFile.Reader(fname)
    l = readdlm(zf.files[6])  # tract lengths (delays)
    w = readdlm(zf.files[7])  # coupling weights
    w, l
end

# Kuramoto parameters: global coupling G, noise intensity σ,
# coupling weights w, delay matrix l
struct kp
    G::Float64
    σ::Float64
    w::Array{Float64,2}
    l::Array{Float64,2}
end

# Drift: intrinsic frequency plus delayed sinusoidal coupling
function kf(du, u, h, p, t)
    G = p.G / length(u)
    du .= 2 * 2 * π / 1000
    for j in 1:length(u)
        for i in 1:length(u)
            if i != j
                uⱼ = h(p, t - p.l[i, j]; idxs = j)
                du[i] += G * p.w[i, j] * sin(uⱼ - u[i])
            end
        end
    end
end

# Additive noise
function kg(du, u, h, p, t)
    du .= sqrt(2 * p.σ)
end

w, l = get_tvb_connectome()
nn = size(w, 1)
ic = rand(nn) .* 2 .* π
kh(p, t; idxs = nothing) = idxs === nothing ? ic : ic[idxs]

prob = SDDEProblem(kf, kg, ic, kh, (0.0, 500.0), kp(0.1, 0.01, w, l))
@time sol = solve(prob, SRIW1(), dt = 1.0, reltol = 1e-3, abstol = 1e-3)
```
The Kuramoto oscillator model is the simplest one we might use; others are more involved, but they are all fairly similar in structure.
We intend to make work-precision diagrams for a few solvers on models like this (which are frequently memory-bound because of the time delays), so if/when that's available, I'll also post a link here.
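A minimal sketch of what such a tolerance sweep might look like, assuming the prob from the snippet above; the tolerance grid and the recorded quantities (wall time and step count, rather than a full error-vs-work diagram) are illustrative choices, not part of the original post:

```julia
# Sweep tolerances and record runtime and step count for each solve.
# A proper work-precision diagram would also measure error against a
# reference solution computed with a much tighter tolerance.
tols = 10.0 .^ (-2:-1:-5)
results = map(tols) do tol
    t = @elapsed sol = solve(prob, SRIW1(), dt = 1.0, reltol = tol, abstol = tol)
    (tol = tol, seconds = t, nsteps = length(sol.t))
end
```

Because each solve uses an independent noise realization, runtimes and step counts will vary between runs; averaging over several repetitions per tolerance would give smoother curves.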
That's perfect, thanks.
I'm making a little progress toward the work-precision diagram, but I noticed that tightening abstol down to 1e-5 increases runtime, while at 1e-6 the runtime drops. @time shows the allocations follow the runtime, so I guess the adaptivity is taking fewer steps. This is surprising to me, but I wondered if it could be related to the delays?
There are quite a few cases where this happens, even in ODEs. It's rare, but sometimes solving a problem more accurately makes it better behaved, since being off the manifold can induce things like stiffness; it's not exactly monotonic w.r.t. tolerance. That said, it should be "mostly monotonic" with respect to tolerance.