BoundaryValueDiffEq.jl
Support Enzyme and Mooncake as AD backends
Part of #288
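I still need to figure out the failure on the BVP with interpolation-based boundary conditions, but in general Enzyme and Mooncake are working now.
It seems there is still some work to do: in the benchmark below, the Enzyme backend is about 3x slower at the median than ForwardDiff (1.344 ms vs. 433.5 μs):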
julia> @benchmark sol = solve(prob, MIRK4(; jac_alg = jac_alg_forwarddiff), dt = 0.05)
BenchmarkTools.Trial: 7002 samples with 1 evaluation per sample.
Range (min … max): 378.708 μs … 444.849 ms ┊ GC (min … max): 0.00% … 97.30%
Time (median): 433.500 μs ┊ GC (median): 0.00%
Time (mean ± σ): 767.981 μs ± 5.955 ms ┊ GC (mean ± σ): 21.83% ± 10.28%
█▃ ▁
███▇▆▆▅▅▅▅▅▅▄▄▄▁▄▁▃▅▁▁▄▁▄▁▃▃▃▃▁▄▄▁▃▁▁▃▃▄▁▁▁▁▄▃▃▄▄▅▆▅▅▄▄▁▃▁▃▅▅ █
379 μs Histogram: log(frequency) by time 7.72 ms <
Memory estimate: 1.46 MiB, allocs estimate: 27500.
julia> @benchmark sol = solve(prob, MIRK4(; jac_alg = jac_alg_enzyme), dt = 0.05)
BenchmarkTools.Trial: 3252 samples with 1 evaluation per sample.
Range (min … max): 1.227 ms … 32.662 ms ┊ GC (min … max): 0.00% … 95.57%
Time (median): 1.344 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.536 ms ± 1.057 ms ┊ GC (mean ± σ): 11.47% ± 14.84%
▄█▅▃▁
█████▇▅▅▄▅▃▃▄▃▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▆▆▅▇▇▇▇▆▆▅▆▆▅▃▅ █
1.23 ms Histogram: log(frequency) by time 5.3 ms <
Memory estimate: 2.43 MiB, allocs estimate: 27517.
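The benchmark calls above do not show how `jac_alg_forwarddiff` and `jac_alg_enzyme` were constructed. A minimal sketch of one plausible setup follows, assuming both were built with `BVPJacobianAlgorithm` and the ADTypes backend selectors; the actual definitions used for these timings may differ (for example, by setting separate modes for the boundary-condition and collocation parts).

```julia
using BoundaryValueDiffEq
using ADTypes  # provides the AutoForwardDiff / AutoEnzyme backend selectors

# Assumption: one AD backend is used for the whole BVP Jacobian.
# These names mirror the benchmark calls above; how they were really
# defined is not shown in this thread.
jac_alg_forwarddiff = BVPJacobianAlgorithm(AutoForwardDiff())
jac_alg_enzyme = BVPJacobianAlgorithm(AutoEnzyme())

# Either object is then passed to the solver as in the benchmarks, e.g.
#   solve(prob, MIRK4(; jac_alg = jac_alg_enzyme), dt = 0.05)
# where `prob` is a BVProblem defined elsewhere. Enzyme.jl itself must be
# loaded (`import Enzyme`) before solving with the Enzyme backend.
```

Keeping the backend choice in a single `jac_alg` object makes it straightforward to rerun the same `solve` call with each backend and compare timings.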
Benchmark Results
| | master | cf39fc8193920c... | master / cf39fc8193920c... |
|---|---|---|---|
| Simple Pendulum/IIP/BoundaryValueDiffEqMIRK.MIRK2() | 1.14 ± 0.023 s | 1.13 ± 0.02 s | 1 |
| Simple Pendulum/IIP/BoundaryValueDiffEqMIRK.MIRK3() | 16.1 ± 0.77 ms | 15.9 ± 0.78 ms | 1.01 |
| Simple Pendulum/IIP/BoundaryValueDiffEqMIRK.MIRK4() | 3.02 ± 0.21 ms | 3 ± 0.25 ms | 1.01 |
| Simple Pendulum/IIP/BoundaryValueDiffEqMIRK.MIRK5() | 8.78 ± 0.85 ms | 8.75 ± 0.91 ms | 1 |
| Simple Pendulum/IIP/BoundaryValueDiffEqMIRK.MIRK6() | 1.55 ± 0.27 ms | 1.54 ± 0.23 ms | 1.01 |
| Simple Pendulum/IIP/MultipleShooting(10, Tsit5; grid_coarsening = false) | 1.88 ± 0.65 ms | 1.87 ± 0.66 ms | 1 |
| Simple Pendulum/IIP/MultipleShooting(10, Tsit5; grid_coarsening = true) | 3.18 ± 1 ms | 3.16 ± 1 ms | 1 |
| Simple Pendulum/IIP/MultipleShooting(100, Tsit5; grid_coarsening = false) | 0.0679 ± 0.017 s | 0.0669 ± 0.019 s | 1.01 |
| Simple Pendulum/IIP/MultipleShooting(100, Tsit5; grid_coarsening = true) | 0.0813 ± 0.022 s | 0.0798 ± 0.02 s | 1.02 |
| Simple Pendulum/IIP/Shooting(Tsit5()) | 0.248 ± 0.074 ms | 0.251 ± 0.074 ms | 0.988 |
| Simple Pendulum/OOP/BoundaryValueDiffEqMIRK.MIRK2() | 1.31 ± 0.011 s | 1.29 ± 0.0079 s | 1.02 |
| Simple Pendulum/OOP/BoundaryValueDiffEqMIRK.MIRK3() | 19.2 ± 6.3 ms | 18.9 ± 5.9 ms | 1.02 |
| Simple Pendulum/OOP/BoundaryValueDiffEqMIRK.MIRK4() | 3.57 ± 0.24 ms | 3.46 ± 0.19 ms | 1.03 |
| Simple Pendulum/OOP/BoundaryValueDiffEqMIRK.MIRK5() | 10.5 ± 0.91 ms | 10.3 ± 1.2 ms | 1.03 |
| Simple Pendulum/OOP/BoundaryValueDiffEqMIRK.MIRK6() | 1.82 ± 0.19 ms | 1.8 ± 0.18 ms | 1.01 |
| Simple Pendulum/OOP/MultipleShooting(10, Tsit5; grid_coarsening = false) | 3.55 ± 2.9 ms | 3.61 ± 2.9 ms | 0.982 |
| Simple Pendulum/OOP/MultipleShooting(10, Tsit5; grid_coarsening = true) | 6.05 ± 4.9 ms | 6.11 ± 5 ms | 0.991 |
| Simple Pendulum/OOP/MultipleShooting(100, Tsit5; grid_coarsening = false) | 0.12 ± 0.017 s | 0.116 ± 0.017 s | 1.03 |
| Simple Pendulum/OOP/MultipleShooting(100, Tsit5; grid_coarsening = true) | 0.144 ± 0.03 s | 0.145 ± 0.025 s | 0.997 |
| Simple Pendulum/OOP/Shooting(Tsit5()) | 0.653 ± 0.092 ms | 0.639 ± 0.051 ms | 1.02 |
| time_to_load | 5.08 ± 0.032 s | 5.12 ± 0.12 s | 0.992 |
Benchmark Plots
A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).
Looks good now. We just need a patch from FastAlmostBandedMatrices.jl for the sparse AD of the boundary conditions, and then we are good to go.
Registering
🎉