NeuralPDE.jl
[WIP] ForwardDiff AD
What's the speed difference? High order?
It's not very clear with the 3rd-order ODE in terms of speed, but the accuracy of the solution is clearly higher. The numerical method is fast because it is not accurate. That is, at approximately the same execution speed for the 3rd-order problem, AD gives higher accuracy than the numerical derivative.
AD:      BFGS 7.006 s (8964020 allocations: 13.38 GiB), ADAM 9.114 s (12620139 allocations: 18.74 GiB)
numeric: BFGS 5.260 s (6739726 allocations: 10.56 GiB), ADAM 9.743 s (11914038 allocations: 18.64 GiB)
Cool, that's what I expected. What about on KS?
KS requires 2 dimensions. I suppose it will work well with AD. I will write a hardcoded 2-dim version tomorrow for the presentation, but it is actually a parser issue: the independent variables need to be passed into the body of the derivative function.
function inner_parser_derivative(f, vars, dervars_num, order)
    # the variable with respect to which the outermost derivative is taken
    der_vars = vars[dervars_num[order]]
    if order > 1
        # recurse: wrap the lower-order derivative expression in another ForwardDiff.derivative
        expr = :(ForwardDiff.derivative($der_vars -> $(inner_parser_derivative(f, vars, dervars_num, order - 1)), $der_vars))
        return expr
    else
        # base case: differentiate the network evaluation itself; [1] extracts the scalar output
        expr = :(ForwardDiff.derivative($der_vars -> $f($(vars...), θ), $der_vars)[1])
        return expr
    end
end
function parser_derivative(f, vars, dervars_num)
    # dervars_num holds one variable index per differentiation, so its length is the derivative order
    order = size(dervars_num)[1]
    :(($(vars...), θ) -> $(inner_parser_derivative(f, vars, dervars_num, order)))
end
function parser_derivative_broadcast(vars, dervars_num, expr_der_func)
    der_func = @RuntimeGeneratedFunction(expr_der_func)
    # interpolate the generated function and broadcast it inside the returned expression;
    # broadcasting at expression-build time, as before, was a bug
    :(($(vars...), θ) -> $der_func.($(vars...), Ref(θ)))
end
It's a scaling thing. AD should scale as the power in the polynomial while numerical won't. AD should have worse scaling for higher order AD (contrary to what most people would think! This is a case where AD is strictly worse than numerical! I mention it in the paper). This would be a very good thing to note. It follows from the discussion in Griewank's AD book.
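A minimal sketch of that scaling (my own illustration, not from the paper): each extra order of nested ForwardDiff wraps the Duals one level deeper, so the work per evaluation multiplies with the order, while a central-difference stencil only adds a couple of extra function evaluations per order.
using ForwardDiff, BenchmarkTools

f(x) = sin(x) * exp(x)
d1(x) = ForwardDiff.derivative(f, x)   # Dual numbers
d2(x) = ForwardDiff.derivative(d1, x)  # Dual of Dual
d3(x) = ForwardDiff.derivative(d2, x)  # Dual of Dual of Dual

@btime d1(1.0)  # each level roughly multiplies the cost of the previous one
@btime d2(1.0)
@btime d3(1.0)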
So far, AD is significantly slower than the numerical derivative.
Poisson's eq, 2D, maxiters=100
AD:      BFGS 39.180 s (33577736 allocations: 68.19 GiB), ADAM 14.158 s (12608940 allocations: 24.47 GiB)
numeric: BFGS 945.420 ms (1723964 allocations: 692.32 MiB), ADAM 334.701 ms (629516 allocations: 240.37 MiB)
I think it should eventually be possible to speed up the AD runtime significantly, for example via https://juliadiff.org/ForwardDiff.jl/v0.10.2/user/advanced.html#Configuring-Chunk-Size-1
Just set the chunk size to 1 to see if there's a type-instability from it.
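Something like this should do it (a sketch; `loss` here stands in for the actual loss closure):
using ForwardDiff

# force a chunk size of 1 instead of the heuristic default; if this removes
# a large timing discrepancy, the default chunk was triggering a type instability
cfg = ForwardDiff.GradientConfig(loss, initθ, ForwardDiff.Chunk{1}())
ForwardDiff.gradient(loss, initθ, cfg)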
The Kuramoto–Sivashinsky equation looks good with ForwardDiff AD, but it still takes too long.
BFGS, 300 iterations
I have a problem with RuntimeGeneratedFunctions.
using RuntimeGeneratedFunctions, DiffEqFlux, Flux, ForwardDiff
phi = FastChain(FastDense(1,12,Flux.tanh),FastDense(12,12,Flux.tanh),FastDense(12,1))
initθ = Float64.(DiffEqFlux.initial_params(phi))
indvars = [:x]
undv = [1]
derivative_expr = parser_derivative(phi,indvars,undv)
derivative_expr =:((θ_, x)->begin
#= none:3 =#
ForwardDiff.derivative((x->begin
#= none:8 =#
(((x, θ_)->begin
#= none:3 =#
((FastChain{Tuple{FastDense{typeof(tanh), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(tanh), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}, FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}}}((FastDense{typeof(tanh), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}(12, 1, tanh, DiffEqFlux.var"#initial_params#90"{Vector{Float32}}(Float32[0.34942162, -0.6759936, 0.31213823, -0.6459028, -0.27329746, -0.49197906, 0.6756018, 0.5688718, -0.36240864, -0.6780105 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), true), FastDense{typeof(tanh), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}(12, 12, tanh, DiffEqFlux.var"#initial_params#90"{Vector{Float32}}(Float32[0.19952452, 0.003932476, -0.16829717, -0.30636954, 0.11300719, 0.09094751, 0.29692435, 0.15857887, -0.29655933, -0.0685693 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), true), FastDense{typeof(identity), DiffEqFlux.var"#initial_params#90"{Vector{Float32}}}(1, 12, identity, DiffEqFlux.var"#initial_params#90"{Vector{Float32}}(Float32[-0.5841597, -0.3415628, -0.17458333, -0.11816714, -0.3745709, 0.3589071, -0.2814319, -0.41690788, 0.6444232, 0.0038750547, 0.21813866, 0.6468031, 0.0]), true))))(vcat(x), θ_))[1]
end))(x, θ_)
end), x)
end)
RuntimeGeneratedFunctions.init(@__MODULE__)
der2 = @RuntimeGeneratedFunction(derivative_expr)
julia> der2.(Ref(initθ),rand(1,10))
ERROR: The function body AST defined by this @generated function is not pure. This likely means it contains a closure or comprehension.
Stacktrace:
[1] RuntimeGeneratedFunction
@ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117 [inlined]
[2] _broadcast_getindex_evalf
@ ./broadcast.jl:648 [inlined]
[3] _broadcast_getindex
@ ./broadcast.jl:621 [inlined]
[4] getindex
@ ./broadcast.jl:575 [inlined]
[5] copy
@ ./broadcast.jl:922 [inlined]
[6] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, RuntimeGeneratedFunction{(:θ_, :x), var"#_RGF_ModTag", var"#_RGF_ModTag", (0x44ae534c, 0x7029fbab, 0x55027912, 0x1acfec15, 0x18543f01)}, Tuple{Base.RefValue{Vector{Float64}}, Matrix{Float64}}})
@ Base.Broadcast ./broadcast.jl:883
[7] top-level scope
@ none:1
I got that same error message when trying to implement the Integral with Interval(x, Inf).
if a isa Num
    a = NeuralPDE.build_symbolic_loss_function(nothing, indvars, depvars,
                                               dict_indvars, dict_depvars,
                                               dict_depvar_input, phi, derivative_,
                                               nothing, chain, initθ, strategy,
                                               integrand = a, integrating_depvars = integrating_depvars,
                                               param_estim = false, default_p = nothing)
...
I still can't figure out how to use @RuntimeGeneratedFunction here: one way or another it needs to call an anonymous function inside the function body, which leads to an error with RuntimeGeneratedFunction. @ChrisRackauckas do you have ideas?
using ForwardDiff, Flux, DiffEqFlux, RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)
phi = FastChain(FastDense(2,12,Flux.tanh),FastDense(12,12,Flux.tanh),FastDense(12,1))
initθ = Float64.(DiffEqFlux.initial_params(phi))
u = (x,y,θ_) -> phi(vcat(x,y),θ_)[1]
z = function (x, y, θ)
    x -> u(x, y, θ)
end
x=1
y=1
z(x,y, initθ)
ForwardDiff.derivative(z(x,y, initθ),x)
func_der = (x,y, θ) -> ForwardDiff.derivative(x -> u(x,y, θ),x)
func_der(x,y, initθ)
zzz = :(function (x, y, θ)
    x -> u(x, y, θ)
end)
zz_ = @RuntimeGeneratedFunction(zzz)
julia> zz_(1,1,initθ)(1)
ERROR: The function body AST defined by this @generated function is not pure. This likely means it contains a closure or comprehension.
Stacktrace:
[1] (::RuntimeGeneratedFunction{(:x, :y, :θ), var"#_RGF_ModTag", var"#_RGF_ModTag", (0xc616339c, 0xb34a9fc4, 0xa68d2e38, 0x274a155f, 0x6f509da5)})(::Int64, ::Int64, ::Vector{Float64})
@ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117
[2] top-level scope
@ none:1
zz_ = @eval $zzz
zz_(1,1,initθ)(1)
func_der = (x,y, θ) -> ForwardDiff.derivative(zz_(x,y,initθ),x)
func_der(1,1,initθ)
someexpr = :((x,y,θ) -> ForwardDiff.derivative($zz_(x,y,θ),x))
zd = @RuntimeGeneratedFunction(someexpr)
zd(1,2,initθ)
someexpr2 = :((x,y,θ) -> ForwardDiff.derivative(x -> u(x,y, θ),x))
zd2= @RuntimeGeneratedFunction(someexpr2)
zd2(1,2,initθ)
ERROR: The function body AST defined by this @generated function is not pure. This likely means it contains a closure or comprehension.
Stacktrace:
[1] (::RuntimeGeneratedFunction{(:x, :y, :θ), var"#_RGF_ModTag", var"#_RGF_ModTag", (0x00b2cc5f, 0xddfe1493, 0x62e6a275, 0x5ca8b534, 0x57f11384)})(::Int64, ::Int64, ::Vector{Float64})
@ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117
[2] top-level scope
@ none:1
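One thing that might sidestep the purity check (a sketch, untested here): hoist the closure into a top-level helper, so that the generated body's AST contains only an ordinary call and the closure itself is constructed in regular compiled code:
# top-level helper: builds the closure outside any generated body
partial_x(u, y, θ) = x -> u(x, y, θ)

someexpr3 = :((x, y, θ) -> ForwardDiff.derivative(partial_x($u, y, θ), x))
zd3 = @RuntimeGeneratedFunction(someexpr3)
zd3(1, 2, initθ)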
Derivative benchmarks.
using NeuralPDE, DiffEqFlux, Flux, DiffEqBase, ForwardDiff, Zygote, BenchmarkTools, Test
chain = FastChain(FastDense(2,16,Flux.σ),FastDense(16,16,Flux.σ),FastDense(16,1))
initθ = Float64.(DiffEqFlux.initial_params(chain))
eltypeθ = eltype(initθ)
parameterless_type_θ = DiffEqBase.parameterless_type(initθ)
phi = NeuralPDE.get_phi(chain,parameterless_type_θ)
u_ = (cord, θ, phi)->phi(cord, θ)
_epsilon = one(eltype(initθ)) / (2*cbrt(eps(eltype(initθ))))  # 1/(2ε), the central-difference prefactor
ε = cbrt(eps(eltype(initθ)))                                  # finite-difference step
function get_ε(dim, der_num, eltypeθ)
    # perturbation vector: step `epsilon` in coordinate `der_num`, zero elsewhere
    epsilon = cbrt(eps(eltypeθ))
    ε = zeros(eltypeθ, dim)
    ε[der_num] = epsilon
    ε
end
eps_x = get_ε(2, 1,Float64)
eps_y = get_ε(2, 2,Float64)
r =rand(1,10)
#First order derivative
#ad
dphi_xa = eval(NeuralPDE.parser_derivative(phi,[:x,:y],[1]))
@benchmark dphi_xa.(Ref(initθ),r,r) setup=(r=rand(1,100))
BenchmarkTools.Trial: 7166 samples with 1 evaluation.
Range (min … max): 288.637 μs … 189.844 ms ┊ GC (min … max): 0.00% … 99.73%
Time (median): 383.253 μs ┊ GC (median): 0.00%
Time (mean ± σ): 707.494 μs ± 6.754 ms ┊ GC (mean ± σ): 35.62% ± 3.72%
█▅
▁▂▄▆▇▆▇██▅▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▃▃▃▃▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
289 μs Histogram: frequency by time 955 μs <
Memory estimate: 1.08 MiB, allocs estimate: 3214.
#numeric
@benchmark derivative(phi,u_,r,[eps_x],1,initθ) setup=(r=rand(2,100))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 55.242 μs … 108.085 ms ┊ GC (min … max): 0.00% … 99.85%
Time (median): 72.666 μs ┊ GC (median): 0.00%
Time (mean ± σ): 101.174 μs ± 1.081 ms ┊ GC (mean ± σ): 10.67% ± 1.00%
▃▂▅▆▄█▇▃▂▁ ▁▁▁▂▄▅▅▄▃▃▂▁▁▁▁▁▂▁ ▂
█████████████████████████████████████▇█▇▇▇▇▇▆▆▆▇▆▇▆▅▆▆▅▆▅▆▅▆▅ █
55.2 μs Histogram: log(frequency) by time 225 μs <
Memory estimate: 126.77 KiB, allocs estimate: 60.
## aa second order ad
dphi_xxaa = eval(NeuralPDE.parser_derivative(phi,[:x,:y],[1,1]))
## na second order numeric + ad
dphi_xn = (x,θ) -> (phi(x .+ eps_x, θ) .- phi(x .- eps_x, θ)) .* _epsilon  # central difference in x
dphi_xn([1,2],initθ)
dphi_xxna = eval(NeuralPDE.parser_derivative(dphi_xn,[:x,:y],[1]))
## an second order ad + numeric
dphi_xa = eval(NeuralPDE.parser_derivative(phi,[:x,:y],[1]))
dphi_xxan = (θ,x,y) -> (dphi_xa.(Ref(θ), x .+ ε, y) .- dphi_xa.(Ref(θ), x .- ε, y)) .* _epsilon
## nn, second order numeric
derivative = NeuralPDE.get_numeric_derivative()
dphi_xxnn = (θ,cord) -> derivative(phi,u_,cord,[eps_x,eps_x],2,θ)
phi_ = (p) -> phi(p, initθ)[1]
hess_phi = Zygote.hessian(phi_,[1,2])
hess_phi[1]
dphi_xxaa(initθ,1,2)
dphi_xxna(initθ,1,2)
dphi_xxan(initθ,1,2)
dphi_xxnn(initθ,[1,2])
@test isapprox(hess_phi[1], dphi_xxaa(initθ,1,2), atol=1e-12)
@test isapprox(hess_phi[1], dphi_xxna(initθ,1,2), atol=1e-12)
@test isapprox(hess_phi[1], dphi_xxan(initθ,1,2), atol=1e-12)
@test isapprox(hess_phi[1], dphi_xxnn(initθ,[1,2])[1], atol=1e-6)
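The looser tolerance for the pure numeric version is about what a quick error budget predicts (my estimate, not from the thread): with step ε = cbrt(eps(Float64)), nesting two central differences amplifies roundoff by 1/ε²:
ε = cbrt(eps(Float64))    # ≈ 6.06e-6, the step used by get_ε
ε^2                       # ≈ 3.7e-11, O(ε²) truncation error
eps(Float64) / ε^2        # ≈ 6.1e-6, roundoff term, which dominates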
#Second order derivative
#numeric
@benchmark dphi_xxnn(initθ,r) setup=(r=rand(2,100))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 116.201 μs … 118.110 ms ┊ GC (min … max): 0.00% … 99.72%
Time (median): 151.786 μs ┊ GC (median): 0.00%
Time (mean ± σ): 219.325 μs ± 1.943 ms ┊ GC (mean ± σ): 15.30% ± 1.73%
▄▅▅██▅▄▃▂▂▃▃▂▁▁▂▄▄▅▅▅▄▃▃▃▂▂▂▂▂▁▁▁▁▁ ▁ ▂
███████████████████████████████████████████▇███▇▆▇▆▇▇▆▇▆▇▆▆▆▇ █
116 μs Histogram: log(frequency) by time 400 μs <
Memory estimate: 257.11 KiB, allocs estimate: 126.
#ad
@benchmark dphi_xxaa.(Ref(initθ),r,r) setup=(r=rand(1,100))
BenchmarkTools.Trial: 6281 samples with 1 evaluation.
Range (min … max): 392.570 μs … 175.002 ms ┊ GC (min … max): 0.00% … 99.54%
Time (median): 492.478 μs ┊ GC (median): 0.00%
Time (mean ± σ): 780.928 μs ± 6.247 ms ┊ GC (mean ± σ): 30.24% ± 3.77%
█▇▁
▁▃▅▇▇▆▄▃███▆▄▄▃▄▃▃▂▂▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
393 μs Histogram: frequency by time 995 μs <
Memory estimate: 1.20 MiB, allocs estimate: 3314.
#ad + numeric
@benchmark dphi_xxan.(Ref(initθ),r,r) setup=(r=rand(1,100))
BenchmarkTools.Trial: 2375 samples with 1 evaluation.
Range (min … max): 1.210 ms … 185.799 ms ┊ GC (min … max): 0.00% … 98.84%
Time (median): 1.525 ms ┊ GC (median): 0.00%
Time (mean ± σ): 2.085 ms ± 8.590 ms ┊ GC (mean ± σ): 20.59% ± 4.95%
▁ ▃██▂▁
▂▃▃▄█▇▇██████▇▅▅▅▄▄▄▄▃▃▃▃▃▃▃▄▃▄▃▄▃▃▃▃▃▃▃▃▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂ ▃
1.21 ms Histogram: frequency by time 2.89 ms <
Memory estimate: 2.20 MiB, allocs estimate: 8214.
#numeric + ad
@benchmark dphi_xxna.(Ref(initθ),r,r) setup=(r=rand(1,100))
BenchmarkTools.Trial: 2835 samples with 1 evaluation.
Range (min … max): 870.898 μs … 187.949 ms ┊ GC (min … max): 0.00% … 99.18%
Time (median): 1.152 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.775 ms ± 9.307 ms ┊ GC (mean ± σ): 27.79% ± 5.26%
▃█▅
▁▂▃▅█▇▆█████▆▅▄▃▃▄▃▄▃▃▄▃▃▄▃▃▃▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▁▁▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁ ▃
871 μs Histogram: frequency by time 2.32 ms <
Memory estimate: 2.18 MiB, allocs estimate: 6814.
#backprop
#ForwardDiff
#numeric
r=rand(2,100)
@benchmark ForwardDiff.gradient(θ->sum(dphi_xxnn(θ,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 66 samples with 1 evaluation.
Range (min … max): 58.578 ms … 155.051 ms ┊ GC (min … max): 0.00% … 54.71%
Time (median): 67.640 ms ┊ GC (median): 0.00%
Time (mean ± σ): 75.757 ms ± 24.971 ms ┊ GC (mean ± σ): 11.30% ± 17.06%
█▁
▄▄▅█████▃▆▁▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▁▃▃▁▃ ▁
58.6 ms Histogram: frequency by time 153 ms <
Memory estimate: 88.87 MiB, allocs estimate: 6559.
r=rand(1,100)
#ad
@benchmark ForwardDiff.gradient(θ->sum(dphi_xxaa.(Ref(θ),r,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 31 samples with 1 evaluation.
Range (min … max): 74.627 ms … 304.291 ms ┊ GC (min … max): 0.00% … 48.78%
Time (median): 115.402 ms ┊ GC (median): 0.00%
Time (mean ± σ): 162.370 ms ± 82.649 ms ┊ GC (mean ± σ): 35.01% ± 26.95%
█▄▁ ▁ ▁ █
███▁▁█▁▆█▆▆▁▁▁▁▆▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆▆▁▁▁▁▆▆▁█▆▆▁▁▆▁▆▁▁▁▁▁▆ ▁
74.6 ms Histogram: frequency by time 304 ms <
Memory estimate: 368.69 MiB, allocs estimate: 104840.
#ad + numeric
@benchmark ForwardDiff.gradient(θ->sum(dphi_xxan.(Ref(θ),r,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 18 samples with 1 evaluation.
Range (min … max): 148.058 ms … 388.769 ms ┊ GC (min … max): 0.00% … 43.44%
Time (median): 315.969 ms ┊ GC (median): 44.01%
Time (mean ± σ): 291.624 ms ± 77.209 ms ┊ GC (mean ± σ): 39.03% ± 19.03%
▃ █ ▃▃
▇▁█▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▇▇█▁▁██▇▁▁▇▁▁▇▁▁▁▁▁▁▇ ▁
148 ms Histogram: frequency by time 389 ms <
Memory estimate: 660.88 MiB, allocs estimate: 264340.
#numeric + ad
@benchmark ForwardDiff.gradient(θ->sum(dphi_xxna.(Ref(θ),r,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 18 samples with 1 evaluation.
Range (min … max): 138.084 ms … 378.683 ms ┊ GC (min … max): 0.00% … 42.60%
Time (median): 321.987 ms ┊ GC (median): 44.54%
Time (mean ± σ): 295.129 ms ± 81.405 ms ┊ GC (mean ± σ): 38.62% ± 18.68%
▁ ▁█ ▁
█▁▁▁▆▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁██▁▁▆▆█▆▁▆▁▆▁▁▁▆ ▁
138 ms Histogram: frequency by time 379 ms <
Memory estimate: 658.75 MiB, allocs estimate: 215040.
#Zygote
r=rand(2,100)
#numeric
@benchmark Zygote.gradient(θ->sum(dphi_xxnn(θ,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 5587 samples with 1 evaluation.
Range (min … max): 643.525 μs … 129.094 ms ┊ GC (min … max): 0.00% … 98.66%
Time (median): 727.925 μs ┊ GC (median): 0.00%
Time (mean ± σ): 890.099 μs ± 3.607 ms ┊ GC (mean ± σ): 12.06% ± 2.96%
█▃
▁▂▂▃▅██▆▄▃▃▃▄▅▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
644 μs Histogram: frequency by time 1.26 ms <
Memory estimate: 880.38 KiB, allocs estimate: 1678.
r=rand(1,100)
#ad
@benchmark Zygote.gradient(θ->sum(dphi_xxaa.(Ref(θ),r,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 4842 samples with 1 evaluation.
Range (min … max): 650.008 μs … 163.139 ms ┊ GC (min … max): 0.00% … 99.42%
Time (median): 740.486 μs ┊ GC (median): 0.00%
Time (mean ± σ): 1.029 ms ± 6.268 ms ┊ GC (mean ± σ): 24.73% ± 4.04%
▄██▂
▃▄▅▆▇██████▆▅▅▄▅▆▅▄▄▃▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂ ▃
650 μs Histogram: frequency by time 1.28 ms <
Memory estimate: 1.52 MiB, allocs estimate: 5322.
#ad + numeric
@benchmark Zygote.gradient(θ->sum(dphi_xxan.(Ref(θ),r,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 1439 samples with 1 evaluation.
Range (min … max): 2.571 ms … 160.798 ms ┊ GC (min … max): 0.00% … 97.86%
Time (median): 2.876 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.469 ms ± 9.076 ms ┊ GC (mean ± σ): 15.41% ± 5.77%
▇▇█▇▅▂▅▄▁▂▂▄▃▃▁
▂▃▆████████████████▇▆▄▄▃▄▄▃▃▄▃▃▃▃▃▂▂▂▂▁▂▂▃▁▂▃▂▂▂▂▁▂▁▂▂▂▂▂▂▂ ▄
2.57 ms Histogram: frequency by time 4.16 ms <
Memory estimate: 3.04 MiB, allocs estimate: 21631.
#numeric + ad
@benchmark Zygote.gradient(θ->sum(dphi_xxna.(Ref(θ),r,r)),inθ) setup=(inθ=initθ)
BenchmarkTools.Trial: 2761 samples with 1 evaluation.
Range (min … max): 991.767 μs … 190.764 ms ┊ GC (min … max): 0.00% … 98.95%
Time (median): 1.248 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.804 ms ± 9.277 ms ┊ GC (mean ± σ): 27.59% ± 5.33%
▇▅▇▄▃ ▁▂▆█▅
▂▂▃▅▄█████████████▆▅▄▃▃▄▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▁▂▁▁▁ ▃
992 μs Histogram: frequency by time 2.14 ms <
Memory estimate: 2.51 MiB, allocs estimate: 8822.
Poisson equation benchmarks
using NeuralPDE, ModelingToolkit, Flux, DiffEqFlux, GalacticOptim, Optim
import ModelingToolkit: Interval, infimum, supremum
@parameters x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2
eq = Dxx(u(x,y)) + Dyy(u(x,y)) ~ -sin(pi*x)*sin(pi*y)
bcs = [u(0,y) ~ 0.0, u(1,y) ~ -sin(pi*1)*sin(pi*y),
u(x,0) ~ 0.0, u(x,1) ~ -sin(pi*x)*sin(pi*1)]
domains = [x ∈ Interval(0.0,1.0),
y ∈ Interval(0.0,1.0)]
fastchain = FastChain(FastDense(2,12,Flux.σ),FastDense(12,12,Flux.σ),FastDense(12,1))
initθ = Float64.(DiffEqFlux.initial_params(fastchain))
grid_strategy = NeuralPDE.GridTraining(0.1)
discretization = NeuralPDE.PhysicsInformedNN(fastchain,
grid_strategy;
init_params = initθ,
AD =false)
@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])
prob_numeric = NeuralPDE.discretize(pde_system,discretization)
discretization = NeuralPDE.PhysicsInformedNN(fastchain,
grid_strategy;
init_params = initθ,
AD =true)
prob_AD = NeuralPDE.discretize(pde_system,discretization)
@btime res = GalacticOptim.solve(prob_numeric, ADAM(0.1); maxiters=500);
5.896 s (3143007 allocations: 1.17 GiB)
@btime res = GalacticOptim.solve(prob_numeric, BFGS(); maxiters=500);
25.525 s (15823593 allocations: 6.05 GiB)
@btime res = GalacticOptim.solve(prob_AD, ADAM(0.1); maxiters=500);
60.659 s (55159507 allocations: 121.93 GiB)
@btime res = GalacticOptim.solve(prob_AD, BFGS(); maxiters=500);
199.556 s (142649297 allocations: 315.47 GiB)
Runtime dependence on the size of the neural network
num = ...
fastchain = FastChain(FastDense(2,num,Flux.σ),FastDense(num,num,Flux.σ),FastDense(num,1))
initθ = Float64.(DiffEqFlux.initial_params(fastchain))
grid_strategy = NeuralPDE.GridTraining(0.1)
quasirandom_strategy = NeuralPDE.QuasiRandomTraining(50;
bcs_points= 10,
sampling_alg = LatinHypercubeSample())
discretization = NeuralPDE.PhysicsInformedNN(fastchain,
grid_strategy;
init_params = initθ,
AD =false)
@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])
prob_numeric = NeuralPDE.discretize(pde_system,discretization)
discretization = NeuralPDE.PhysicsInformedNN(fastchain,
grid_strategy;
init_params = initθ,
AD =true)
prob_AD = NeuralPDE.discretize(pde_system,discretization)
# num = 4
@btime GalacticOptim.solve(prob_numeric, ADAM(0.1); maxiters=10);
14.994 ms (62886 allocations: 15.96 MiB)
@btime GalacticOptim.solve(prob_numeric, BFGS(); maxiters=10);
48.262 ms (169923 allocations: 43.27 MiB)
@btime GalacticOptim.solve(prob_AD, ADAM(0.1); maxiters=10);
48.977 ms (282046 allocations: 148.32 MiB)
@btime res_ad = GalacticOptim.solve(prob_AD, BFGS(); maxiters=10);
140.144 ms (761654 allocations: 400.64 MiB)
# num = 8
@btime GalacticOptim.solve(prob_numeric, ADAM(0.1); maxiters=10);
19.958 ms (62885 allocations: 19.77 MiB)
@btime GalacticOptim.solve(prob_numeric, BFGS(); maxiters=10);
74.776 ms (232741 allocations: 74.36 MiB)
@btime GalacticOptim.solve(prob_AD, ADAM(0.1); maxiters=10);
324.239 ms (564645 allocations: 722.51 MiB)
@btime res_ad = GalacticOptim.solve(prob_AD, BFGS(); maxiters=10);
1.384 s (2089252 allocations: 2.61 GiB)
# num = 12
@btime GalacticOptim.solve(prob_numeric, ADAM(0.1); maxiters=10);
21.025 ms (62884 allocations: 24.03 MiB)
@btime GalacticOptim.solve(prob_numeric, BFGS(); maxiters=10);
68.583 ms (182490 allocations: 74.26 MiB)
@btime GalacticOptim.solve(prob_AD, ADAM(0.1); maxiters=10);
1.253 s (1103214 allocations: 2.44 GiB)
@btime res_ad = GalacticOptim.solve(prob_AD, BFGS(); maxiters=10);
3.476 s (3089135 allocations: 6.83 GiB)
# num = 16
@btime GalacticOptim.solve(prob_numeric, ADAM(0.1); maxiters=10);
22.563 ms (62884 allocations: 28.65 MiB)
@btime GalacticOptim.solve(prob_numeric, BFGS(); maxiters=10);
83.655 ms (188771 allocations: 98.19 MiB)
@btime GalacticOptim.solve(prob_AD, ADAM(0.1); maxiters=10);
3.008 s (1844114 allocations: 6.05 GiB)
@btime res_ad = GalacticOptim.solve(prob_AD, BFGS(); maxiters=10);
9.464 s (5532460 allocations: 18.18 GiB)
Poisson equation prediction accuracy: automatic vs. numerical differentiation
num = 5
fastchain = FastChain(FastDense(2,num,Flux.σ),FastDense(num,num,Flux.σ),FastDense(num,1))
initθ = Float64.(DiffEqFlux.initial_params(fastchain))
grid_strategy = NeuralPDE.GridTraining(0.05)
discretization = NeuralPDE.PhysicsInformedNN(fastchain,
grid_strategy;
init_params = initθ,
AD =false)
@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])
prob_numeric = NeuralPDE.discretize(pde_system,discretization)
discretization = NeuralPDE.PhysicsInformedNN(fastchain,
grid_strategy;
init_params = initθ,
AD =true)
prob_AD = NeuralPDE.discretize(pde_system,discretization)
cb = (p, l) -> (println("Current loss is: $l"); false)  # simple progress callback
res_num = GalacticOptim.solve(prob_numeric, BFGS(); cb=cb, maxiters=2000)
res_ad = GalacticOptim.solve(prob_AD, BFGS(); cb=cb, maxiters=2000)
phi = discretization.phi
xs,ys = [infimum(d.domain):0.01:supremum(d.domain) for d in domains]
analytic_sol_func(x,y) = (sin(pi*x)*sin(pi*y))/(2pi^2)
u_predict_num = reshape([first(phi([x,y],res_num.minimizer)) for x in xs for y in ys],(length(xs),length(ys)))
u_predict_ad = reshape([first(phi([x,y],res_ad.minimizer)) for x in xs for y in ys],(length(xs),length(ys)))
u_real = reshape([analytic_sol_func(x,y) for x in xs for y in ys], (length(xs),length(ys)))
diff_u_num = abs.(u_predict_num .- u_real)
diff_u_ad = abs.(u_predict_ad .- u_real)
@test u_predict_num ≈ u_real rtol = 0.02
@test u_predict_ad ≈ u_real rtol = 0.002
using Plots
p1 = plot(xs, ys, u_real, linetype=:contourf,title = "analytic");
p2 = plot(xs, ys, u_predict_ad, linetype=:contourf,title = "predict_ad");
p3 = plot(xs, ys, u_predict_num, linetype=:contourf,title = "predict_num");
p4 = plot(xs, ys,diff_u_ad,linetype=:contourf,title = "error_ad");
p5 = plot(xs, ys, diff_u_num,linetype=:contourf,title = "error_num");
plot(p1,p2,p3)
plot(p4,p5)
using Statistics
julia> mean(diff_u_num)
0.0003909686701947168
julia> mean(diff_u_ad)
2.3798038490057497e-5
julia> std(diff_u_num)
0.0003120935609823069
julia> std(diff_u_ad)
1.9645709644550055e-5
@btime res_num = GalacticOptim.solve(prob_numeric, BFGS(); maxiters=2000)
13.649 s (24797861 allocations: 13.01 GiB)
@btime res_ad = GalacticOptim.solve(prob_AD, BFGS(); maxiters=2000)
314.281 s (734773650 allocations: 501.37 GiB)
Problem with opaque_closure
julia> VERSION
v"1.7.1"
chain = FastChain(FastDense(1,16,Flux.σ),FastDense(16,16,Flux.σ),FastDense(16,1))
initθ = Float64.(DiffEqFlux.initial_params(chain))
phi_ = NeuralPDE.get_phi(chain,DiffEqBase.parameterless_type(initθ))
u_(x,θ_) = phi_(vcat(x),θ_)[1]
u_(1, initθ)
inner_expr =:(x->$u_(x,θ_))
inner_func = @RuntimeGeneratedFunction(inner_expr)
expr = :((x,θ_)->ForwardDiff.derivative($inner_expr,x))
func = @RuntimeGeneratedFunction(expr)
RuntimeGeneratedFunction(#=in Main=#, #=using Main=#, :((x, θ_)->begin
#= none:1 =#
ForwardDiff.derivative($(Expr(:opaque_closure, :(x->begin
#= none:1 =#
(u_)(x, θ_)
end))), x)
end))
func(1,initθ)
ForwardDiff.derivative(x->func(x,initθ),1)
ForwardDiff.gradient(initθ->func(1,initθ),initθ)
ERROR: Cannot determine ordering of Dual tags ForwardDiff.Tag{Core.OpaqueClosure{Tuple{Any}, Any}, Int64} and ForwardDiff.Tag{var"#174#175", Float64}
Stacktrace:
[1] partials
@ ~/.julia/packages/ForwardDiff/CkdHU/src/dual.jl:111 [inlined]
[2] extract_derivative(#unused#::Type{ForwardDiff.Tag{Core.OpaqueClosure{Tuple{Any}, Any}, Int64}}, y::ForwardDiff.Dual{ForwardDiff.Tag{var"#174#175", Float64}, ForwardDiff.Dual{ForwardDiff.Tag{Core.OpaqueClosure{Tuple{Any}, Any}, Int64}, Float64, 1}, 12})
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/derivative.jl:81
[3] derivative(f::Core.OpaqueClosure{Tuple{Any}, Any}, x::Int64)
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/derivative.jl:14
[4] macro expansion
@ ./none:1 [inlined]
[5] macro expansion
@ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:129 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] generated_callfunc
@ ./none:0 [inlined]
[8] (::RuntimeGeneratedFunction{(:x, :θ_), var"#_RGF_ModTag", var"#_RGF_ModTag", (0x3adef619, 0x6cabbff9, 0x73bfdf49, 0x6187b365, 0xa07ef6b1)})(::Int64, ::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12}})
@ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117
[9] (::var"#174#175")(initθ::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12}})
@ Main ./none:1
[10] chunk_mode_gradient(f::var"#174#175", x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12}}})
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/gradient.jl:150
[11] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12}}}, ::Val{true})
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/gradient.jl:21
[12] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#174#175", Float64}, Float64, 12}}}) (repeats 2 times)
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/gradient.jl:17
[13] top-level scope
@ none:1
@ChrisRackauckas do you have any ideas on how to deal with this?
Add Dual and Tags for Core.OpaqueClosure?
Did you try nesting RGFs?
Yes, it is the same error:
using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)
chain = FastChain(FastDense(1,16,Flux.σ),FastDense(16,16,Flux.σ),FastDense(16,1))
initθ = Float64.(DiffEqFlux.initial_params(chain))
phi_ = NeuralPDE.get_phi(chain,DiffEqBase.parameterless_type(initθ))
u_(x,θ_) = phi_(vcat(x),θ_)[1]
u_(1, initθ)
inner_expr =:(θ->(x-> $u_(x,θ))) #nested
inner_func = @RuntimeGeneratedFunction(inner_expr)
# inner_func = eval(inner_expr)
inner_func(initθ)(1)
expr = :((x,θ)->ForwardDiff.derivative($inner_func(θ),x))
func = @RuntimeGeneratedFunction(expr)
# func = eval(expr)
func(1,initθ)
RuntimeGeneratedFunction(#=in Main=#, #=using Main=#, :((x, θ)->begin
#= none:1 =#
ForwardDiff.derivative((RuntimeGeneratedFunction{(:θ,), var"#_RGF_ModTag", var"#_RGF_ModTag", (0xd45d54ce, 0x11696e8f, 0x9d485e30, 0x8cfdc8bb, 0xdb9be5d1)}(quote
#= none:3 =#
$(Expr(:opaque_closure, :(x->begin
#= none:3 =#
(u_)(x, θ)
end)))
end))(θ), x)
end))
func.(rand(1,10),Ref(initθ))
ForwardDiff.gradient(x->sum(func.(x,Ref(initθ))),rand(1,10))
ForwardDiff.derivative(x->func(x,initθ),1)
ForwardDiff.gradient(initθ->func(1,initθ),initθ)
ERROR: Cannot determine ordering of Dual tags ForwardDiff.Tag{Core.OpaqueClosure{Tuple{Any}, Any}, Int64} and ForwardDiff.Tag{var"#176#177", Float64}
Stacktrace:
[1] partials
@ ~/.julia/packages/ForwardDiff/CkdHU/src/dual.jl:111 [inlined]
[2] extract_derivative(#unused#::Type{ForwardDiff.Tag{Core.OpaqueClosure{Tuple{Any}, Any}, Int64}}, y::ForwardDiff.Dual{ForwardDiff.Tag{var"#176#177", Float64}, ForwardDiff.Dual{ForwardDiff.Tag{Core.OpaqueClosure{Tuple{Any}, Any}, Int64}, Float64, 1}, 12})
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/derivative.jl:81
[3] derivative(f::Core.OpaqueClosure{Tuple{Any}, Any}, x::Int64)
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/derivative.jl:14
[4] macro expansion
@ ./none:1 [inlined]
[5] macro expansion
@ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:129 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] generated_callfunc
@ ./none:0 [inlined]
[8] (::RuntimeGeneratedFunction{(:x, :θ), var"#_RGF_ModTag", var"#_RGF_ModTag", (0x65de165a, 0x715b7b1c, 0x390e3fc0, 0x84edfe71, 0xaf23b923)})(::Int64, ::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12}})
@ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117
[9] (::var"#176#177")(initθ::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12}})
@ Main ./none:1
[10] chunk_mode_gradient(f::var"#176#177", x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12}}})
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/gradient.jl:150
[11] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12}}}, ::Val{true})
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/gradient.jl:21
[12] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#176#177", Float64}, Float64, 12}}}) (repeats 2 times)
@ ForwardDiff ~/.julia/packages/ForwardDiff/CkdHU/src/gradient.jl:17
[13] top-level scope
@ none:1
https://www.stochasticlifestyle.com/improved-forwarddiff-jl-stacktraces-with-package-tags/ should help.
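The pattern from that post would look roughly like this here (a sketch; `NeuralPDEADTag` and `loss` are hypothetical stand-ins): supply a concrete tag type in the config so the Dual tag is not derived from an anonymous closure or OpaqueClosure type, which keeps the tag ordering decidable:
using ForwardDiff

struct NeuralPDEADTag end  # hypothetical package-level tag type

loss(θ) = sum(abs2, θ)     # stand-in for the real loss function
θ0 = rand(16)

tag = ForwardDiff.Tag(NeuralPDEADTag(), eltype(θ0))
cfg = ForwardDiff.GradientConfig(loss, θ0, ForwardDiff.Chunk{4}(), tag)
# Val{false}() skips the tag consistency check, since we supplied our own tag
ForwardDiff.gradient(loss, θ0, cfg, Val{false}())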