Reverse Integration CI
Now that we have some more CI resources, let's get reverse CI set up for some integration tests. We don't have infinite resources, but let's try to get some useful things going.
Specifically, my thoughts here are to have CPU reverse integration CI on 32-core machines. Each integration can have one of these runners [which do need to complete in some reasonable time, TBD]. These won't be considered blocking for PRs (unless they can spit out MWEs as examples of the failure), but can be used to catch failures early (and hopefully prevent issues before they pop up, and provide some nicer notions of stability).
- [x] Lux @avik-pal could you take a look? (done in #2569)
- [ ] Flux @CarloLucibello, @mcabbott could you take a look?
- [ ] Turing @mhauru could you take a look?
- [x] SciML @ChrisRackauckas could you take a look? (done in #2593)
- [ ] Comrade @ptiede could you take a look? [assuming there are some interesting CPU-only tests]
- [x] DifferentiationInterface @gdalle could you take a look? (done in #2531)
- [ ] Molly @jgreener64 could you take a look? [assuming there are some interesting CPU-only tests] (#2728)
- [ ] Oceananigans @glwagner could you take a look?
- [ ] @MilesCranmer I remember you mentioning something else
- [ ] Chmy/friends @luraess could you take a look? [assuming there are some interesting CPU-only tests]
- [x] Distributions (#2617)
- [x] KernelAbstractions (#2709)
- [ ] MPI (#518)
Very good news! How do you want to do this, one yml file for all packages? GitHub actions or Buildkite? What is your idea of "reasonable time"?
GitHub Actions. Up to either, but currently we have one yml for all integration tests. Not sure about reasonable time yet, but it should for sure complete before the rest of Enzyme proper CI, and not hog all the CI VMs.
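To make the "one yml, many packages" idea concrete, a single parametrised workflow could be sketched roughly like this (the package list, paths, and file names are illustrative placeholders, not Enzyme's actual CI config):

```yaml
# .github/workflows/downstream.yml (hypothetical sketch)
name: Downstream
on: [push, pull_request]
jobs:
  downstream:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        package: [SciML, Lux, Flux]   # illustrative list
    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@v2
        with:
          version: '1'
      - name: Run integration tests
        run: |
          julia --project=test/downstream/${{ matrix.package }} -e 'using Pkg; Pkg.instantiate()'
          julia --project=test/downstream/${{ matrix.package }} test/downstream/${{ matrix.package }}/runtests.jl
```

With `fail-fast: false`, one broken downstream package won't cancel the other jobs in the matrix.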
also cc @giordano
One question is: do you want to just execute the full downstream test suite for each package (which would be comprehensive but may be very compute intensive and run loads of unrelated tests), or just have smaller test scripts to be included here which represent relevant workloads?
I think we want to have smaller test scripts that are isolated to just enzyme-related tests [and to the extent possible, direct calls to Enzyme fns].
This is coming from two different constraints:
- obviously the downstream package devs want it to be as high-level/comprehensive as possible, to make it easier for them
- the enzyme devs [particularly new ones] need the tests to be as close to the code they're working on to understand what happened and how to fix
My middle-ground idea here (subject to debate/change) is that we set up integration tests that are exclusive to the Enzyme usage (pinned to specific versions of integration packages, which can be updated), but not marked blocking unless the potentially broken tests are sufficiently direct/minimized uses of Enzyme that they are actionable from the dev side (while still being equally helpful for realizing a change could be problematic, and still likely getting it fixed during the PR).
Yeah, I personally think dedicated scripts which reproduce relevant workloads would be best from the Enzyme (and CI usage) point of view; of course, this requires more work for the other devs who need to maintain extra integration tests (but the benefit is reducing unintended breakage).
In terms of setup in this scheme, I'd suggest having a single workflow which parametrises the downstream package and runs the relevant integration script. Then, we can have a dedicated directory for each package (something like test/downstream/<NAME>), which contains the Project.toml, which uses the sources section to point to the top-level directory of this repo, and the entry point of the tests (runtests.jl?).
All I wrote above is up to debate, these are my suggestions!
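As a sketch of the suggested layout, a hypothetical test/downstream/<NAME>/Project.toml could look something like the following (the path, name, and deps are placeholders; UUIDs should be taken from the General registry, and the `[sources]` section assumes a Julia version that supports it, 1.11+):

```toml
# test/downstream/SciML/Project.toml (hypothetical path)
name = "SciMLDownstreamTests"

[deps]
# Enzyme plus whatever downstream packages the script needs,
# with UUIDs taken from the General registry.
Enzyme = "7da242da-08ed-463a-9acd-ee780be4f1d9"

[sources]
# Point Enzyme at the checkout of this repo rather than the registry.
Enzyme = { path = "../../.." }
```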
Great idea.
Another way to do this is to have the Enzyme tests in the packages themselves, where it is easier to keep them up to date. A subset of tests can then be run during Enzyme CI using the GROUP environment variable, like the Zygote tests do: https://github.com/FluxML/Zygote.jl/blob/master/.github/workflows/Downstream.yml#L21-L26.
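For reference, the GROUP-gating pattern in a downstream package's test/runtests.jl looks roughly like this (file names here are hypothetical, but the pattern matches what SciML and Zygote CI use):

```julia
# Hypothetical test/runtests.jl in a downstream package.
# Enzyme's CI would set GROUP=Enzyme to run only the Enzyme-related tests.
const GROUP = get(ENV, "GROUP", "All")

if GROUP == "All" || GROUP == "Core"
    include("core_tests.jl")       # the package's usual test suite
end
if GROUP == "All" || GROUP == "Enzyme"
    include("enzyme_tests.jl")     # Enzyme-specific differentiation tests
end
```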
I think having the test code be in this repo is sufficiently beneficial for folks who want to try to debug and reproduce failures from here.
Direct adjoints of the ODE solver:
```julia
using Enzyme, OrdinaryDiffEqTsit5, StaticArrays, SciMLBase, DiffEqBase, ForwardDiff, Test

function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

const _saveat = SA[0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0]

function f_dt(y::Array{Float64}, u0::Array{Float64})
    tspan = (0.0, 3.0)
    prob = ODEProblem{true, SciMLBase.FullSpecialize}(lorenz!, u0, tspan)
    sol = DiffEqBase.solve(prob, Tsit5(), saveat = _saveat,
                           sensealg = DiffEqBase.SensitivityADPassThrough(),
                           abstol = 1e-12, reltol = 1e-12)
    y .= sol[1, :]
    return nothing
end;

function f_dt(u0)
    tspan = (0.0, 3.0)
    prob = ODEProblem{true, SciMLBase.FullSpecialize}(lorenz!, u0, tspan)
    sol = DiffEqBase.solve(prob, Tsit5(), saveat = _saveat,
                           sensealg = DiffEqBase.SensitivityADPassThrough(),
                           abstol = 1e-12, reltol = 1e-12)
    sol[1, :]
end;

u0 = [1.0; 0.0; 0.0]
fdj = ForwardDiff.jacobian(f_dt, u0)

ezj = stack(map(1:3) do i
    d_u0 = zeros(3)
    dy = zeros(13)
    y = zeros(13)
    d_u0[i] = 1.0
    Enzyme.autodiff(Forward, f_dt, Duplicated(y, dy), Duplicated(u0, d_u0))
    dy
end)
@test ezj ≈ fdj

function f_dt2(u0)
    tspan = (0.0, 3.0)
    prob = ODEProblem{true, SciMLBase.FullSpecialize}(lorenz!, u0, tspan)
    sol = DiffEqBase.solve(prob, Tsit5(), dt = 0.1, saveat = _saveat,
                           sensealg = DiffEqBase.SensitivityADPassThrough(),
                           abstol = 1e-12, reltol = 1e-12)
    sum(sol[1, :])
end

fdg = ForwardDiff.gradient(f_dt2, u0)
d_u0 = zeros(3)
Enzyme.autodiff(Reverse, f_dt2, Active, Duplicated(u0, d_u0))
@test d_u0 ≈ fdg
```
is a decently hard set. Then there are the adjoint rules: https://github.com/SciML/SciMLSensitivity.jl/blob/master/test/alternative_ad_frontend.jl
Taking the first one can be a 99% indicator:
```julia
using OrdinaryDiffEq, SciMLSensitivity, Zygote, Enzyme, Test

Enzyme.API.typeWarning!(false)

odef(du, u, p, t) = du .= u .* p
const prob = ODEProblem(odef, [2.0], (0.0, 1.0), [3.0])

struct senseloss0{T}
    sense::T
end

function (f::senseloss0)(u0p)
    prob = ODEProblem{true}(odef, u0p[1:1], (0.0, 1.0), u0p[2:2])
    sum(solve(prob, Tsit5(), abstol = 1e-12, reltol = 1e-12, saveat = 0.1))
end

u0p = [2.0, 3.0]
du0p = zeros(2)
dup = Zygote.gradient(senseloss0(InterpolatingAdjoint()), u0p)[1]
Enzyme.autodiff(Reverse, senseloss0(InterpolatingAdjoint()), Active, Duplicated(u0p, du0p))
@test du0p ≈ dup
```
Note that it's using Enzyme for the VJP in there too, so it's double testing Enzyme. The handling of the linear solvers is delicate:
Maybe take https://github.com/SciML/LinearSolve.jl/blob/main/test/nopre/enzyme.jl:
```julia
using Enzyme, ForwardDiff
using LinearSolve, LinearAlgebra, Test

n = 4
A = rand(n, n);
dA = zeros(n, n);
b1 = rand(n);
db1 = zeros(n);

function f(A, b1; alg = LUFactorization())
    prob = LinearProblem(A, b1)
    sol1 = solve(prob, alg)
    s1 = sol1.u
    norm(s1)
end

f(A, b1) # Uses BLAS

Enzyme.autodiff(Reverse, f, Duplicated(copy(A), dA), Duplicated(copy(b1), db1))

dA2 = ForwardDiff.gradient(x -> f(x, eltype(x).(b1)), copy(A))
db12 = ForwardDiff.gradient(x -> f(eltype(x).(A), x), copy(b1))

@test dA ≈ dA2
@test db1 ≈ db12
```
Tagging @penelopeysm as she's now dealing with AD stuff in Turing.jl.
@ChrisRackauckas can you make whatever tests you think relevant as a PR?
Is there a PR that shows how you'd like it set up?
#2531
Seems like we should wait for that to merge then?
@ChrisRackauckas the other PRs have merged if you want to give it a go?
@wsmoses I'll start working on this in early December. Sorry for the very long delay.