
Slow solution times when using callbacks and large problem dimensions

grahamrow opened this issue · 8 comments

Greetings, I was surprised to see very slow solution times when using callbacks (~4x slower) even when those callbacks do nothing of interest. In trying to produce a minimal example it became clear that the slowdown depends on the size of the problem at hand. Let's start with a very boring Diff Eq over a large u:

using DifferentialEquations, BenchmarkTools

# Simple elementwise exponential decay
function blub!(du, u, p, t)
    du .= -0.01 .* u
end

function test_no_callbacks(N::Int64)
    u0    = rand(N, N)
    tspan = (0.0, 1000.0)
    prob  = ODEProblem(blub!, u0, tspan; saveat = 5.0, maxiters = 1e7, save_everystep = false)
    sol   = solve(prob)
end

@benchmark test_no_callbacks(400) 
# Gives mean time 251 ms
# Memory estimate: 267.38 MiB, allocs estimate: 954.

And now with callbacks:

function noop!(integrator)
    nothing
end

function test_callbacks(N::Int64)
    u0    = rand(N, N)
    tspan = (0.0, 1000.0)
    cb    = PresetTimeCallback(0.0:5.0:1000.0, noop!; save_positions = (true, false))
    prob  = ODEProblem(blub!, u0, tspan; saveat = 5.0, maxiters = 1e7, save_everystep = false, callback = cb)
    sol   = solve(prob)
end

@benchmark test_callbacks(400)
# Gives mean time 420 ms
# Memory estimate: 267.39 MiB, allocs estimate: 772.

Here is a plot over a few different N values:

[Figure: mean solve time vs. N, with and without callbacks]

The slowdown depends in a nontrivial way on the system size. Expected behavior would be a fixed overhead for callback calls with a no-op like the one above, and obviously more overhead when actual processing is done.

I suspect this might have something to do with the callback caching, but there's not a big difference in allocations, and the callbacks travel across a few packages I'm not familiar with. This can be reproduced on Julia 1.6.0 or 1.7.2 with DiffEq v6.20.0. I haven't had a chance to try other Problem types yet. Thanks in advance for any thoughts!

grahamrow avatar May 06 '22 19:05 grahamrow

Can you provide a StatProfiler flamegraph?

https://github.com/tkluck/StatProfilerHTML.jl

That would highlight where the potential issues are.
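For reference, generating one should be roughly this (untested sketch, reusing the test_callbacks function from the original post):

using StatProfilerHTML

test_callbacks(400)              # run once first so compilation isn't profiled
@profilehtml test_callbacks(400) # writes an HTML flamegraph report to disk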

ChrisRackauckas avatar May 14 '22 11:05 ChrisRackauckas

If you did:

function noop!(integrator)
    u_modified!(integrator,false)
end

does it go away? I assume a lot of it may just be running more initializations (especially for stiff ODE solvers with Jacobians) due to having to assume a potential discontinuity.
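In full, the MWE with that change would look roughly like this (untested sketch; the test_callbacks_unmodified name is just for illustration, with the solver keywords passed to solve):

function test_callbacks_unmodified(N::Int64)
    u0    = rand(N, N)
    tspan = (0.0, 1000.0)
    cb    = PresetTimeCallback(0.0:5.0:1000.0, noop!; save_positions = (true, false))  # noop! as defined above
    prob  = ODEProblem(blub!, u0, tspan)
    solve(prob; saveat = 5.0, maxiters = 1e7, callback = cb)
end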

ChrisRackauckas avatar May 14 '22 11:05 ChrisRackauckas

Hi Chris, thanks for the replies. The u_modified! function seems like a nice trick to avoid dealing with discontinuities, but it doesn't seem to change (i.e. improve) the performance. I have been running this with a few threads, which appears to be causing some weird behavior in the profiler. I'll report on profiling with a single thread shortly.

grahamrow avatar May 27 '22 01:05 grahamrow

Not knowing a better way of attaching the flamegraphs, I've included the output directories here: statprof_no_callbacks.zip, statprof_with_callbacks.zip. Happy to package them in some other manner. In this run the "with_callbacks" version uses the noop! function you suggested.

There do seem to be considerably more copy operations going on with the callbacks.

grahamrow avatar May 27 '22 01:05 grahamrow

I think you flipped the two?

But okay yeah, this is all interpolation performance.

What if you set interp_points = 0?

ChrisRackauckas avatar May 27 '22 01:05 ChrisRackauckas

Oops, yes I did flip them. I think PresetTimeCallback returns a DiscreteCallback, which doesn't take interp_points as an argument. I'd presume discrete callbacks know exactly when to evaluate u, unlike their continuous counterparts. Are DiscreteCallbacks saving before and after the indicated times? That alone would certainly change the performance.
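One quick way to check that (untested sketch): turn off both save flags on the callback and see whether the gap closes —

cb = PresetTimeCallback(0.0:5.0:1000.0, noop!; save_positions = (false, false))  # no extra saves at the callback times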

grahamrow avatar May 27 '22 13:05 grahamrow

In DiffEqBase callbacks.jl I see:

#Base Case: Just one
@inline function apply_discrete_callback!(integrator,callback::DiscreteCallback)
  saved_in_cb = false
  if callback.condition(integrator.u,integrator.t,integrator)
    # handle saveat
    _, savedexactly = savevalues!(integrator)
    saved_in_cb = true
    @inbounds if callback.save_positions[1]
      # if already saved then skip saving
      savedexactly || savevalues!(integrator,true)
    end
    integrator.u_modified = true
    callback.affect!(integrator)
    @inbounds if callback.save_positions[2]
      savevalues!(integrator,true)
      saved_in_cb = true
    end
  end
  integrator.sol.destats.ncondition += 1
  integrator.u_modified,saved_in_cb
end

Seemingly it's always setting integrator.u_modified = true when the callback condition is met. EDIT: perhaps callback.affect! is where the user-supplied function gets called, which can then reset that flag via u_modified! as you suggested above.

grahamrow avatar May 27 '22 13:05 grahamrow

Yes, it's a safety measure which you can turn off.

https://diffeq.sciml.ai/stable/basics/integrator/#SciMLBase.u_modified!

But your callback would have to be purely a control callback (for example, one that only saves values) for that to be safe, so it rarely comes up.
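For example, a callback that only pulls values out of the integrator (untested sketch; the recorded vector and record_only! name are just for illustration):

recorded = Float64[]                 # illustrative storage for the extracted values

function record_only!(integrator)
    push!(recorded, integrator.u[1]) # read the state, never write it
    u_modified!(integrator, false)   # safe here: nothing was modified
end

cb = PresetTimeCallback(0.0:5.0:1000.0, record_only!)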

Your problem isn't that, though. Your problem is just the speed of the interpolation, which is something that needs to be worked on but I just haven't gotten to. If you decrease the interp points, that should greatly lessen it by hitting the interpolation a lot less.
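(For reference, interp_points is a keyword of ContinuousCallback, so a continuous event would look roughly like the untested sketch below, with a hypothetical condition function; as noted above, DiscreteCallback / PresetTimeCallback doesn't take it.)

condition(u, t, integrator) = u[1] - 0.5                         # hypothetical event function
affect!(integrator) = nothing
ccb = ContinuousCallback(condition, affect!; interp_points = 0)  # fewer interpolation evaluations when scanning for the event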

ChrisRackauckas avatar Jun 03 '22 10:06 ChrisRackauckas