BoundaryValueDiffEq.jl
BoundaryValueDiffEq.jl copied to clipboard
[WIP] A Faster (Experimental) Version of Multiple Shooting
Main Idea
Place Nodes at points the BC is evaluate to minimize ODE Sensitivity Cost
How it is done?
- Create a wrapper around the ODESolution and capture the calls and store them in a vector!
- Indexing not supported
- Once we locate the points we just place the nodes at those points and continue as usual with Multiple Shooting
- For the BC part, we no longer have to use Forward Sensitivity.
- We don't support grid-coarsening for now. But not very hard to support.
TODOs
- [ ] Tests
- [ ] Share more code
- [ ] Grid Coarsening
- [ ] Some more detailed benchmarks
- [ ] Ensure Project.toml doesn't have unnecessary dependencies
Benchmark
using BoundaryValueDiffEq, OrdinaryDiffEq
tspan = (0.0, π / 2)
function simplependulum!(du, u, p, t)
g, L, θ, dθ = 9.81, 1.0, u[1], u[2]
du[1] = dθ
du[2] = -(g / L) * sin(θ)
end
function bc_pendulum!(residual, u, p, t)
t0, t1 = extrema(t)
residual[1] = u((t0 + t1) / 2)[1] + π / 2
residual[2] = u(t1)[1] - π / 2 # the solution at the end of the time span should be pi/2
end
bvp1 = BVProblem(simplependulum!, bc_pendulum!, [π / 2, π / 2], tspan)
alg1 = MultipleShooting(100, Tsit5(); auto_static_nodes = Val(true))
alg2 = MultipleShooting(100, Tsit5(); auto_static_nodes = Val(false),
grid_coarsening = false)
@benchmark solve(bvp1, alg1)
#=
BenchmarkTools.Trial: 713 samples with 1 evaluation.
Range (min … max): 4.286 ms … 26.760 ms ┊ GC (min … max): 0.00% … 62.69%
Time (median): 5.920 ms ┊ GC (median): 0.00%
Time (mean ± σ): 6.995 ms ± 3.894 ms ┊ GC (mean ± σ): 14.13% ± 17.30%
▁█▅
▄▆▆████▅▅▄▄▃▂▃▂▂▁▂▁▂▁▁▁▂▁▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▃▃▃▂▂ ▃
4.29 ms Histogram: frequency by time 21.3 ms <
Memory estimate: 6.60 MiB, allocs estimate: 74812.
=#
@benchmark solve(bvp1, alg2)
#=
BenchmarkTools.Trial: 112 samples with 1 evaluation.
Range (min … max): 24.958 ms … 94.565 ms ┊ GC (min … max): 0.00% … 47.87%
Time (median): 33.904 ms ┊ GC (median): 0.00%
Time (mean ± σ): 44.865 ms ± 20.232 ms ┊ GC (mean ± σ): 27.83% ± 25.83%
▇█▁▅ ▇▁ ▁ ▁
████████▆▅▃▃▅▅▁█▃█▅▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁█▆██▃▃█▃▆▁▃▃▃▁▁▁▃ ▃
25 ms Histogram: frequency by time 86.2 ms <
Memory estimate: 71.60 MiB, allocs estimate: 881218.
=#
(The current benchmark is a bit contrived to show scale of improvements, but we see improvements even in the smallest cases)
Benchmark Results
| master | 59827dc9d4b295... | t[master]/t[59827dc9d4b295...] | |
|---|---|---|---|
| Simple Pendulum/IIP/MultipleShooting(10, Tsit5; grid_coarsening = false) | 3.11 ± 0.068 ms | 3.07 ± 0.063 ms | 1.01 |
| Simple Pendulum/IIP/MultipleShooting(10, Tsit5; grid_coarsening = true) | 5.71 ± 0.12 ms | 3.06 ± 0.06 ms | 1.87 |
| Simple Pendulum/IIP/MultipleShooting(100, Tsit5; grid_coarsening = false) | 0.0817 ± 0.0056 s | 0.0771 ± 0.0039 s | 1.06 |
| Simple Pendulum/IIP/MultipleShooting(100, Tsit5; grid_coarsening = true) | 0.283 ± 0.0069 s | 0.0769 ± 0.0038 s | 3.68 |
| Simple Pendulum/IIP/Shooting(Tsit5()) | 0.282 ± 0.011 ms | 0.285 ± 0.0095 ms | 0.989 |
| Simple Pendulum/OOP/MultipleShooting(10, Tsit5; grid_coarsening = false) | 7.25 ± 0.33 ms | 6.93 ± 0.25 ms | 1.05 |
| Simple Pendulum/OOP/MultipleShooting(10, Tsit5; grid_coarsening = true) | 13 ± 6.4 ms | 7.03 ± 0.36 ms | 1.86 |
| Simple Pendulum/OOP/MultipleShooting(100, Tsit5; grid_coarsening = false) | 0.204 ± 0.0028 s | 0.19 ± 0.0033 s | 1.08 |
| Simple Pendulum/OOP/MultipleShooting(100, Tsit5; grid_coarsening = true) | 1.14 ± 0.0021 s | 0.19 ± 0.0029 s | 5.97 |
| Simple Pendulum/OOP/Shooting(Tsit5()) | 1.11 ± 0.072 ms | 1.1 ± 0.039 ms | 1.01 |
| time_to_load | 9.96 ± 0.034 s | 9.94 ± 0.04 s | 1 |
Benchmark Plots
A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).