DifferentialEquations.jl SYSTEM: show(lasterr) caused an error when using EnsembleGPUArray()

Hello,

I'm trying to solve an EnsembleProblem built over an SDEProblem on the GPU, but every time I try I get an error I cannot understand.

Here is a simplified version of the model:


using DifferentialEquations, Distributions, DiffEqBayes, DiffEqGPU

# Model parameters

β = 0.01# infection rate
λ_R = 0.05 # inverse of transition time from  infected to recovered
λ_D = 0.83 # inverse of transition time from  infected to dead
σ_β = 0.01 
σ_R = 0.01 
σ_D = 0.01 

𝒫 = vcat([β, λ_R, λ_D,σ_β,σ_R, σ_D]...)


# regional contact matrix and regional population

## regional contact matrix
regional_all_contact_matrix = [3.45536   0.485314  0.506389  0.123002 ; 0.597721  2.11738   0.911374  0.323385 ; 0.906231  1.35041   1.60756   0.67411 ; 0.237902  0.432631  0.726488  0.979258] # 4x4 contact matrix

## regional population stratified by age
N= [723208 , 874150, 1330993, 1411928] # array of 4 elements, each of which representing the absolute amount of population in the corresponding age class.


# Initial conditions 
i₀ = 0.075 # fraction of initial infected people in every age class
I₀ = repeat([i₀],4)
S₀ = N.-I₀
R₀ = [0.0 for n in 1:length(N)]
D₀ = [0.0 for n in 1:length(N)]
D_tot₀ = [0.0 for n in 1:length(N)]
ℬ = vcat([S₀, I₀, R₀, D₀, D_tot₀]...) 

# Time 
final_time = 20
𝒯 = (1.0,final_time); 




function SIRD_ac!(du,u,p,t)  
    # Parameters to be calibrated
    β, λ_R, λ_D, _,_,_ = p

    # initialize this parameter (death probability stratified by age, taken from literature)
    
    δ₁, δ₂, δ₃, δ₄ = [0.003/100, 0.004/100, (0.015+0.030+0.064+0.213+0.718)/(5*100), (2.384+8.466+12.497+1.117)/(4*100)]
    δ = vcat(repeat([δ₁],1),repeat([δ₂],1),repeat([δ₃],1),repeat([δ₄],4-1-1-1))


    C = regional_all_contact_matrix 

    
    # State variables
    S = @view u[4*0+1:4*1]
    I = @view u[4*1+1:4*2]
    R = @view u[4*2+1:4*3]
    D = @view u[4*3+1:4*4]
    D_tot = @view u[4*4+1:4*5]

    # Differentials
    dS = @view du[4*0+1:4*1]
    dI = @view du[4*1+1:4*2]
    dR = @view du[4*2+1:4*3]
    dD = @view du[4*3+1:4*4]
    dD_tot = @view du[4*4+1:4*5]
    
    # Force of infection
    Λ = β*[sum([C[i,j]*I[j]/N[j] for j in 1:size(C)[1]]) for i in 1:size(C)[2]] 
    
    # System of equations
    @. dS = -Λ*S
    @. dI = Λ*S - ((1-δ)*λ_R + δ*λ_D)*I
    @. dR = λ_R*(1-δ)*I 
    @. dD = λ_D*δ*I
    @. dD_tot = dD[1]+dD[2]+dD[3]+dD[4]
    

end;

# define noise
function SIRD_ac_noise!(du,u,p,t)  
    # Parameters to be calibrated
    _,_,_, σ_β, σ_R, σ_D = p

    # initialize this parameter (death probability stratified by age, taken from literature)
    
    δ₁, δ₂, δ₃, δ₄ = [0.003/100, 0.004/100, (0.015+0.030+0.064+0.213+0.718)/(5*100), (2.384+8.466+12.497+1.117)/(4*100)]
    δ = vcat(repeat([δ₁],1),repeat([δ₂],1),repeat([δ₃],1),repeat([δ₄],4-1-1-1))


    C = regional_all_contact_matrix 
    
    

    
    # State variables
    S = @view u[4*0+1:4*1]
    I = @view u[4*1+1:4*2]
    R = @view u[4*2+1:4*3]
    D = @view u[4*3+1:4*4]
    D_tot = @view u[4*4+1:4*5]

    # Differentials
    dS = @view du[4*0+1:4*1]
    dI = @view du[4*1+1:4*2]
    dR = @view du[4*2+1:4*3]
    dD = @view du[4*3+1:4*4]
    dD_tot = @view du[4*4+1:4*5]
    
    # Force of infection
    Λ = rand(Normal(0.0, σ_β))*[sum([C[i,j]*I[j]/N[j] for j in 1:size(C)[1]]) for i in 1:size(C)[2]] 
    
    # System of equations
    @. dS = -Λ*S
    @. dI = Λ*S - ((1-δ)*rand(Normal( 0.0,σ_R)) + δ*rand(Normal( 0.0,σ_D)))*I
    @. dR = rand(Normal( 0.0,σ_R))*(1-δ)*I 
    @. dD = rand(Normal( 0.0,σ_D))*δ*I
    @. dD_tot = dD[1]+dD[2]+dD[3]+dD[4]
    

end;

# create problem and check it works
sde_problem = SDEProblem(SIRD_ac!,SIRD_ac_noise!,ℬ, 𝒯, 𝒫 )
solution = @time solve(sde_problem, saveat = 1:final_time); #14.041255 seconds (51.44 M allocations: 9.139 GiB, 7.41% gc time)

Then I build the EnsembleProblem:

ens_problem = EnsembleProblem(sde_problem) #EnsembleProblem with problem SDEProblem

But solving it on GPU doesn't work:

sol = @time solve(ens_problem,EnsembleGPUArray();trajectories=10)

SYSTEM: show(lasterr) caused an error

Stacktrace:
 [1] check_ir!(::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}, ::Array{Tuple{String,Array{Base.StackTraces.StackFrame,1},Any},1}, ::LLVM.CallInst) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\validation.jl:249
 [2] check_ir!(::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}, ::Array{Tuple{String,Array{Base.StackTraces.StackFrame,1},Any},1}, ::LLVM.Function) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\validation.jl:140
 [3] check_ir!(::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}, ::Array{Tuple{String,Array{Base.StackTraces.StackFrame,1},Any},1}, ::LLVM.Module) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\validation.jl:131
 [4] check_ir(::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}, ::LLVM.Module) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\validation.jl:120
 [5] macro expansion at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\driver.jl:241 [inlined]
 [6] macro expansion at C:\Users\claud\.julia\packages\TimerOutputs\ZmKD7\src\TimerOutput.jl:206 [inlined]
 [7] codegen(::Symbol, ::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, validate::Bool, only_entry::Bool) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\driver.jl:239
 [8] compile(::Symbol, ::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, validate::Bool, only_entry::Bool) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\driver.jl:39
 [9] compile at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\driver.jl:35 [inlined]
 [10] _cufunction(::GPUCompiler.FunctionSpec{typeof(Cassette.overdub),Tuple{Cassette.Context{nametype(CUDACtx),KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicCheck,Nothing,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},KernelAbstractions.NDIteration.NDRange{1,KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicSize,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},CartesianIndices{1,Tuple{Base.OneTo{Int64}}}}},Nothing,KernelAbstractions.var"##PassType#253",Nothing,Cassette.DisableHooks},typeof(DiffEqGPU.gpu_gpu_kernel),typeof(SIRD_ac!),CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},Float64}}; kwargs::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:maxthreads,),Tuple{Nothing}}}) at C:\Users\claud\.julia\packages\CUDA\dZvbp\src\compiler\execution.jl:310
 [11] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(Cassette.overdub),Tuple{Cassette.Context{nametype(CUDACtx),KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicCheck,Nothing,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},KernelAbstractions.NDIteration.NDRange{1,KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicSize,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},CartesianIndices{1,Tuple{Base.OneTo{Int64}}}}},Nothing,KernelAbstractions.var"##PassType#253",Nothing,Cassette.DisableHooks},typeof(DiffEqGPU.gpu_gpu_kernel),typeof(SIRD_ac!),CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},Float64}}, ::UInt64; kwargs::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:maxthreads,),Tuple{Nothing}}}) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\cache.jl:24
 [12] gpu_gpu_kernel at .\none:0 [inlined]
 [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(Cassette.overdub),Tuple{Cassette.Context{nametype(CUDACtx),KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicCheck,Nothing,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},KernelAbstractions.NDIteration.NDRange{1,KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicSize,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},CartesianIndices{1,Tuple{Base.OneTo{Int64}}}}},Nothing,KernelAbstractions.var"##PassType#253",Nothing,Cassette.DisableHooks},typeof(DiffEqGPU.gpu_gpu_kernel),typeof(SIRD_ac!),CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},Float64}}, ::UInt64; kwargs::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:maxthreads,),Tuple{Nothing}}}) at C:\Users\claud\.julia\packages\GPUCompiler\GKp4B\src\cache.jl:0
 [14] cufunction(::typeof(Cassette.overdub), ::Type{Tuple{Cassette.Context{nametype(CUDACtx),KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicCheck,Nothing,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},KernelAbstractions.NDIteration.NDRange{1,KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicSize,CartesianIndices{1,Tuple{Base.OneTo{Int64}}},CartesianIndices{1,Tuple{Base.OneTo{Int64}}}}},Nothing,KernelAbstractions.var"##PassType#253",Nothing,Cassette.DisableHooks},typeof(DiffEqGPU.gpu_gpu_kernel),typeof(SIRD_ac!),CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},CUDA.CuDeviceArray{Float64,2,CUDA.AS.Global},Float64}}; name::String, kwargs::Base.Iterators.Pairs{Symbol,Nothing,Tuple{Symbol},NamedTuple{(:maxthreads,),Tuple{Nothing}}}) at C:\Users\claud\.julia\packages\CUDA\dZvbp\src\compiler\execution.jl:298
 [15] macro expansion at C:\Users\claud\.julia\packages\CUDA\dZvbp\src\compiler\execution.jl:109 [inlined]
 [16] (::KernelAbstractions.Kernel{KernelAbstractions.CUDADevice,KernelAbstractions.NDIteration.DynamicSize,KernelAbstractions.NDIteration.DynamicSize,typeof(DiffEqGPU.gpu_gpu_kernel)})(::Function, ::Vararg{Any,N} where N; ndrange::Int64, dependencies::KernelAbstractions.CudaEvent, workgroupsize::Int64, progress::Function) at C:\Users\claud\.julia\packages\KernelAbstractions\jAutM\src\backends\cuda.jl:185
 [17] #42 at C:\Users\claud\.julia\packages\DiffEqGPU\TnpRW\src\DiffEqGPU.jl:317 [inlined]
 [18] SDEFunction at C:\Users\claud\.julia\packages\DiffEqBase\wK2gH\src\diffeqfunction.jl:286 [inlined]
 [19] sde_determine_initdt(::CUDA.CuArray{Float64,2}, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::typeof(DiffEqGPU.diffeqgpunorm), ::SDEProblem{CUDA.CuArray{Float64,2},Tuple{Float64,Float64},true,CUDA.CuArray{Float64,2},Nothing,SDEFunction{true,DiffEqGPU.var"#42#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing}, ::Rational{Int64}, ::StochasticDiffEq.SDEIntegrator{SOSRI,true,CUDA.CuArray{Float64,2},Float64,Float64,Float64,CUDA.CuArray{Float64,2},Float64,Float64,Float64,NoiseProcess{Float64,3,Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},Array{CUDA.CuArray{Float64,2},1},typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST),typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE),true,ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},RSWM{Float64},Nothing,RandomNumbers.Xorshifts.Xoroshiro128Plus},Nothing,CUDA.CuArray{Float64,2},RODESolution{Float64,3,Array{CUDA.CuArray{Float64,2},1},Nothing,Nothing,Array{Float64,1},NoiseProcess{Float64,3,Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},Array{CUDA.CuArray{Float64,2},1},typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST),typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE),true,ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},RSWM{Float64},Nothing,RandomNumbers.Xorshifts.Xoroshiro128Plus},SDEProblem{CUDA.CuArray{Float64,2},Tuple{Float64,Float64},true,CUDA.CuArray{Float64,2},Nothing,SDEFunction{true,DiffEqGPU.var"#42
#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing},SOSRI,StochasticDiffEq.LinearInterpolationData{Array{CUDA.CuArray{Float64,2},1},Array{Float64,1}},DiffEqBase.DEStats},StochasticDiffEq.FourStageSRICache{CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},StochasticDiffEq.FourStageSRIConstantCache{Float64,Float64},CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},SDEFunction{true,DiffEqGPU.var"#42#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Nothing,StochasticDiffEq.SDEOptions{Float64,Float64,typeof(DiffEqGPU.diffeqgpunorm),CallbackSet{Tuple{},Tuple{}},typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN),typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE),DiffEqGPU.var"#6#12",DataStructures.BinaryHeap{Float64,Base.Order.ForwardOrdering},DataStructures.BinaryHeap{Float64,Base.Order.ForwardOrdering},Nothing,Nothing,Int64,Float64,Float64,Float64,Tuple{},Tuple{},Tuple{}},Nothing,Float64,Nothing,Nothing}) at C:\Users\claud\.julia\packages\StochasticDiffEq\Abmgl\src\initdt.jl:30
 [20] auto_dt_reset! at C:\Users\claud\.julia\packages\StochasticDiffEq\Abmgl\src\integrators\integrator_interface.jl:353 [inlined]
 [21] handle_dt!(::StochasticDiffEq.SDEIntegrator{SOSRI,true,CUDA.CuArray{Float64,2},Float64,Float64,Float64,CUDA.CuArray{Float64,2},Float64,Float64,Float64,NoiseProcess{Float64,3,Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},Array{CUDA.CuArray{Float64,2},1},typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST),typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE),true,ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},RSWM{Float64},Nothing,RandomNumbers.Xorshifts.Xoroshiro128Plus},Nothing,CUDA.CuArray{Float64,2},RODESolution{Float64,3,Array{CUDA.CuArray{Float64,2},1},Nothing,Nothing,Array{Float64,1},NoiseProcess{Float64,3,Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},Array{CUDA.CuArray{Float64,2},1},typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST),typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE),true,ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},ResettableStacks.ResettableStack{Tuple{Float64,CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},true},RSWM{Float64},Nothing,RandomNumbers.Xorshifts.Xoroshiro128Plus},SDEProblem{CUDA.CuArray{Float64,2},Tuple{Float64,Float64},true,CUDA.CuArray{Float64,2},Nothing,SDEFunction{true,DiffEqGPU.var"#42#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing},SOSRI,StochasticDiffEq.LinearInterpolationData{Array{CUDA.CuArray{Float64,2},1},Array{Float64,1}},DiffEqBase.DEStats},StochasticDiffEq.FourStageSRICache{CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2},StochasticDiffEq.FourStageSRIConstantCache{Float64,Float64},CUDA.CuArray{Floa
t64,2},CUDA.CuArray{Float64,2},CUDA.CuArray{Float64,2}},SDEFunction{true,DiffEqGPU.var"#42#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Nothing,StochasticDiffEq.SDEOptions{Float64,Float64,typeof(DiffEqGPU.diffeqgpunorm),CallbackSet{Tuple{},Tuple{}},typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN),typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE),DiffEqGPU.var"#6#12",DataStructures.BinaryHeap{Float64,Base.Order.ForwardOrdering},DataStructures.BinaryHeap{Float64,Base.Order.ForwardOrdering},Nothing,Nothing,Int64,Float64,Float64,Float64,Tuple{},Tuple{},Tuple{}},Nothing,Float64,Nothing,Nothing}) at C:\Users\claud\.julia\packages\StochasticDiffEq\Abmgl\src\solve.jl:599
 [22] __init(::SDEProblem{CUDA.CuArray{Float64,2},Tuple{Float64,Float64},true,CUDA.CuArray{Float64,2},Nothing,SDEFunction{true,DiffEqGPU.var"#42#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing}, ::SOSRI, ::Array{Any,1}, ::Array{Any,1}, ::Type{T} where T, ::Type{Val{true}}; saveat::Tuple{}, tstops::Tuple{}, d_discontinuities::Tuple{}, save_idxs::Nothing, save_everystep::Bool, save_noise::Bool, save_on::Bool, save_start::Bool, save_end::Bool, callback::Nothing, dense::Bool, calck::Bool, dt::Float64, adaptive::Bool, gamma::Rational{Int64}, abstol::Nothing, reltol::Nothing, qmax::Rational{Int64}, qmin::Rational{Int64}, qoldinit::Rational{Int64}, fullnormalize::Bool, failfactor::Int64, beta2::Rational{Int64}, beta1::Rational{Int64}, delta::Rational{Int64}, maxiters::Int64, dtmax::Float64, dtmin::Float64, internalnorm::typeof(DiffEqGPU.diffeqgpunorm), isoutofdomain::typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), unstable_check::DiffEqGPU.var"#6#12", verbose::Bool, force_dtmin::Bool, timeseries_errors::Bool, dense_errors::Bool, advance_to_tstop::Bool, stop_at_next_tstop::Bool, initialize_save::Bool, progress::Bool, progress_steps::Int64, progress_name::String, progress_message::typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), userdata::Nothing, initialize_integrator::Bool, seed::UInt64, alias_u0::Bool, alias_jumps::Bool, kwargs::Base.Iterators.Pairs{Symbol,Bool,Tuple{Symbol},NamedTuple{(:default_set,),Tuple{Bool}}}) at C:\Users\claud\.julia\packages\StochasticDiffEq\Abmgl\src\solve.jl:552
 [23] #__solve#97 at C:\Users\claud\.julia\packages\StochasticDiffEq\Abmgl\src\solve.jl:6 [inlined]
 [24] __solve(::SDEProblem{CUDA.CuArray{Float64,2},Tuple{Float64,Float64},true,CUDA.CuArray{Float64,2},Nothing,SDEFunction{true,DiffEqGPU.var"#42#47"{typeof(SIRD_ac!)},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},DiffEqGPU.var"#43#48"{typeof(SIRD_ac_noise!)},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing}, ::Nothing; default_set::Bool, kwargs::Base.Iterators.Pairs{Symbol,Any,Tuple{Symbol,Symbol,Symbol},NamedTuple{(:callback, :internalnorm, :unstable_check),Tuple{Nothing,typeof(DiffEqGPU.diffeqgpunorm),DiffEqGPU.var"#6#12"}}}) at C:\Users\claud\.julia\packages\DifferentialEquations\fpohE\src\default_solve.jl:7
 [25] #solve_call#455 at C:\Users\claud\.julia\packages\DiffEqBase\wK2gH\src\solve.jl:65 [inlined]
 [26] #solve_up#457 at C:\Users\claud\.julia\packages\DiffEqBase\wK2gH\src\solve.jl:92 [inlined]
 [27] #solve#456 at C:\Users\claud\.julia\packages\DiffEqBase\wK2gH\src\solve.jl:74 [inlined]
 [28] batch_solve(::EnsembleProblem{SDEProblem{Array{Float64,1},Tuple{Float64,Float64},true,Array{Float64,1},Nothing,SDEFunction{true,typeof(SIRD_ac!),typeof(SIRD_ac_noise!),LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},typeof(SIRD_ac_noise!),Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing},typeof(DiffEqBase.DEFAULT_PROB_FUNC),typeof(DiffEqBase.DEFAULT_OUTPUT_FUNC),typeof(DiffEqBase.DEFAULT_REDUCTION),Nothing}, ::Nothing, ::EnsembleGPUArray, ::UnitRange{Int64}; kwargs::Base.Iterators.Pairs{Symbol,DiffEqGPU.var"#6#12",Tuple{Symbol},NamedTuple{(:unstable_check,),Tuple{DiffEqGPU.var"#6#12"}}}) at C:\Users\claud\.julia\packages\DiffEqGPU\TnpRW\src\DiffEqGPU.jl:238
 [29] macro expansion at .\timing.jl:233 [inlined]
 [30] __solve(::EnsembleProblem{SDEProblem{Array{Float64,1},Tuple{Float64,Float64},true,Array{Float64,1},Nothing,SDEFunction{true,typeof(SIRD_ac!),typeof(SIRD_ac_noise!),LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},typeof(SIRD_ac_noise!),Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing},typeof(DiffEqBase.DEFAULT_PROB_FUNC),typeof(DiffEqBase.DEFAULT_OUTPUT_FUNC),typeof(DiffEqBase.DEFAULT_REDUCTION),Nothing}, ::Nothing, ::EnsembleGPUArray; trajectories::Int64, batch_size::Int64, unstable_check::Function, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\claud\.julia\packages\DiffEqGPU\TnpRW\src\DiffEqGPU.jl:128
 [31] __solve(::EnsembleProblem{SDEProblem{Array{Float64,1},Tuple{Float64,Float64},true,Array{Float64,1},Nothing,SDEFunction{true,typeof(SIRD_ac!),typeof(SIRD_ac_noise!),LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},typeof(SIRD_ac_noise!),Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing},typeof(DiffEqBase.DEFAULT_PROB_FUNC),typeof(DiffEqBase.DEFAULT_OUTPUT_FUNC),typeof(DiffEqBase.DEFAULT_REDUCTION),Nothing}, ::EnsembleGPUArray; kwargs::Base.Iterators.Pairs{Symbol,Int64,Tuple{Symbol},NamedTuple{(:trajectories,),Tuple{Int64}}}) at C:\Users\claud\.julia\packages\DiffEqBase\wK2gH\src\ensemble\basic_ensemble_solve.jl:87
 [32] solve(::EnsembleProblem{SDEProblem{Array{Float64,1},Tuple{Float64,Float64},true,Array{Float64,1},Nothing,SDEFunction{true,typeof(SIRD_ac!),typeof(SIRD_ac_noise!),LinearAlgebra.UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},typeof(SIRD_ac_noise!),Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},Nothing},typeof(DiffEqBase.DEFAULT_PROB_FUNC),typeof(DiffEqBase.DEFAULT_OUTPUT_FUNC),typeof(DiffEqBase.DEFAULT_REDUCTION),Nothing}, ::EnsembleGPUArray; kwargs::Base.Iterators.Pairs{Symbol,Int64,Tuple{Symbol},NamedTuple{(:trajectories,),Tuple{Int64}}}) at C:\Users\claud\.julia\packages\DiffEqBase\wK2gH\src\solve.jl:100
 [33] top-level scope at .\timing.jl:174 [inlined]
 [34] top-level scope at .\In[4]:0
 [35] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091
┌ Warning: Pkg.installed() is deprecated
└ @ Pkg C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Pkg\src\Pkg.jl:554

So how can I make it work?

environment:

(computationalEpi) pkg> st
Status `E:\IlMIoDrive\magistrale\2anno\primo_periodo\computationalEpi\Project.toml`
  [ff3c4d4f] AutoOptimize v0.1.0 `https://github.com/SciML/AutoOptimize.jl#master`
  [8f4d0f93] Conda v1.5.0
  [071ae1c0] DiffEqGPU v1.8.0
  [0c46a032] DifferentialEquations v6.15.0
  [7073ff75] IJulia v1.23.1
  [961ee093] ModelingToolkit v4.1.0
  [d330b81b] PyPlot v2.9.0

Thank you very much

Dec 05 '20 15:12 ClaudMor

Your system is allocating. You'll need to write a non-allocating version first (or have MTK generate it).
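
A quick way to check whether a function allocates is to measure it with `@allocated`. The following is a minimal sketch using only Base Julia; `rate_alloc`, `rate_noalloc`, `measure` and `I_test` are hypothetical toy names, not part of the model above.

```julia
# Hypothetical toy functions; only Base Julia is used.

# Allocating: the comprehension builds a temporary Vector on every call.
rate_alloc(I) = sum([0.5 * x for x in I])

# Non-allocating: accumulate in a scalar instead of a temporary array.
function rate_noalloc(I)
    acc = 0.0
    for x in I
        acc += 0.5 * x
    end
    return acc
end

# Measure inside a function so that global-scope overhead is not counted.
measure(f, x) = @allocated f(x)

I_test = [1.0, 2.0, 3.0, 4.0]
measure(rate_alloc, I_test); measure(rate_noalloc, I_test)  # warm up (compile first)

println("allocating version:     ", measure(rate_alloc, I_test), " bytes")
println("non-allocating version: ", measure(rate_noalloc, I_test), " bytes")
```

The same kind of measurement can be applied to a call of the drift or noise function on the CPU before trying the GPU.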

Dec 05 '20 16:12 ChrisRackauckas

Hello,

Thanks @ChrisRackauckas for your answer. As already noted in this issue, I am not able to modelingtoolkitize the SDEProblem. Is there a workaround?

PS: If I somehow manage to avoid allocating the parameters (i.e. β, λ_R, λ_D, σ_β, σ_R, σ_D), δ, C and Λ inside the model functions, but leave the rest as it is, would it work?

Dec 05 '20 16:12 ClaudMor

Generate intermediate tuples instead of arrays, i.e.

δ₁, δ₂, δ₃, δ₄ = (0.003/100, 0.004/100, (0.015+0.030+0.064+0.213+0.718)/(5*100), (2.384+8.466+12.497+1.117)/(4*100))

This code:

δ = vcat(repeat([δ₁],1),repeat([δ₂],1),repeat([δ₃],1),repeat([δ₄],4-1-1-1))

has a similar issue: get rid of the allocated arrays. The same applies to the nested comprehension [sum([C[i,j]*I[j]/N[j] for j in 1:size(C)[1]]) for i in 1:size(C)[2]], which builds new arrays on every call.
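
As a rough sketch of what this could look like (not tested on a GPU), δ can be a plain tuple and the force of infection can be accumulated in a scalar loop. The names `δ_tuple`, `force_of_infection` and the `*_demo` values are hypothetical and only serve the illustration; the δ values are copied from the model above.

```julia
# Age-specific death probabilities as a tuple: no vcat/repeat, no heap allocation.
const δ_tuple = (0.003/100, 0.004/100,
                 (0.015+0.030+0.064+0.213+0.718)/(5*100),
                 (2.384+8.466+12.497+1.117)/(4*100))

# Hypothetical helper: force of infection for age class i,
# accumulated in a scalar so that no temporary array is created.
function force_of_infection(β, C, I, N, i)
    acc = zero(β)
    for j in 1:length(N)
        acc += C[i, j] * I[j] / N[j]
    end
    return β * acc
end

# Toy inputs (made up for the demonstration, not the real contact matrix).
C_demo = [1.0 2.0; 3.0 4.0]
I_demo = (0.5, 0.25)
N_demo = (10.0, 20.0)
Λ₁ = force_of_infection(0.01, C_demo, I_demo, N_demo, 1)
```

Since the original repeat calls each produce a single element, the four-element tuple carries the same values as the original δ vector.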

But those are easy. Grabbing a global defined on the CPU (C = regional_all_contact_matrix) will never work on the GPU because that object is not defined there. So you'll want to write it as @SMatrix [3.45536 0.485314 0.506389 0.123002 ; 0.597721 2.11738 0.911374 0.323385 ; 0.906231 1.35041 1.60756 0.67411 ; 0.237902 0.432631 0.726488 0.979258] (from StaticArrays) and move it into the function itself so that it optimizes and fully decomposes.
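
Putting these suggestions together, the drift function might look roughly like the sketch below. This is an untested-on-GPU illustration, not a confirmed fix: the name `SIRD_ac_sketch!` is hypothetical, and moving N into the function as a tuple is an extra assumption (N is also a CPU global in the original code). The state layout (S, I, R, D, D_tot in blocks of 4) follows the model above.

```julia
using StaticArrays

function SIRD_ac_sketch!(du, u, p, t)
    β, λ_R, λ_D = p[1], p[2], p[3]

    # Constants live inside the function, so nothing is read from CPU globals.
    C = @SMatrix [3.45536  0.485314 0.506389 0.123002;
                  0.597721 2.11738  0.911374 0.323385;
                  0.906231 1.35041  1.60756  0.67411;
                  0.237902 0.432631 0.726488 0.979258]
    N = (723208.0, 874150.0, 1330993.0, 1411928.0)   # assumption: moved in as a tuple
    δ = (0.003/100, 0.004/100,
         (0.015+0.030+0.064+0.213+0.718)/(5*100),
         (2.384+8.466+12.497+1.117)/(4*100))

    dD_sum = zero(β)
    for i in 1:4
        S = u[i]
        I = u[4+i]
        # Force of infection for class i, accumulated in a scalar.
        Λ = zero(β)
        for j in 1:4
            Λ += C[i, j] * u[4+j] / N[j]
        end
        Λ *= β
        du[i]    = -Λ * S
        du[4+i]  = Λ * S - ((1 - δ[i]) * λ_R + δ[i] * λ_D) * I
        du[8+i]  = λ_R * (1 - δ[i]) * I
        du[12+i] = λ_D * δ[i] * I
        dD_sum  += du[12+i]
    end
    # As in the original, every D_tot entry receives the sum of the dD entries.
    for i in 1:4
        du[16+i] = dD_sum
    end
    return nothing
end

# CPU check with the initial condition from the post.
u_demo = vcat([723208.0, 874150.0, 1330993.0, 1411928.0] .- 0.075,
              fill(0.075, 4), zeros(12))
du_demo = zeros(20)
SIRD_ac_sketch!(du_demo, u_demo, [0.01, 0.05, 0.83, 0.01, 0.01, 0.01], 1.0)
```

The noise function would need the same treatment; whether the result then compiles under EnsembleGPUArray() still has to be verified on an actual GPU.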

With those changes you should be good. I'll look at the modelingtoolkitization because it should work on this kind of problem, and the code changes it would do are along these lines + a flattening.

Dec 12 '20 04:12 ChrisRackauckas
