ModelingToolkitNeuralNets.jl
Using NeuralNetworkBlock results in an ODESystem with mass matrix
When running the friction example, I noticed that after:
@named nn = NeuralNetworkBlock(1, 1; chain = chain2, rng = StableRNG(1111))
eqs = [connect(model.nn_in, nn.output)
       connect(model.nn_out, nn.input)]
ude_sys = complete(ODESystem(eqs, t, systems = [model, nn], name = :ude_sys))
sys = structural_simplify(ude_sys)
the resulting system has a mass matrix and the unknowns are:
unknowns(sys)
2-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
friction₊y(t)
(nn₊input₊u(t))[1]
I dug a little deeper and noticed that nn₊input₊u has irreducible set in its metadata.
I suggest not making this the default behavior, since a system with a mass matrix restricts the set of solvers that can be used for the problem. One could instead add an option to NeuralNetworkBlock that makes the input irreducible.
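For reference, this is what the irreducible metadata looks like in ModelingToolkit (a minimal sketch, not code from NeuralNetworkBlock itself):

```julia
using ModelingToolkit
using ModelingToolkit: t_nounits as t

# A variable tagged `irreducible` is kept as an unknown by
# structural_simplify instead of being eliminated, which is what
# leaves the algebraic equation (and hence the mass matrix) behind.
@variables u(t) [irreducible = true]
ModelingToolkit.isirreducible(u)  # true
```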
This seems to happen due to function registration. Consider the following:
using ModelingToolkit
using ModelingToolkit: t_nounits as t, D_nounits as D
using ModelingToolkitStandardLibrary.Blocks
using Symbolics: @register_array_symbolic

@named out_inside = RealOutputArray(nout = 2)
@named out_outside = RealOutputArray(nout = 2)
@named in_inside = RealInputArray(nin = 2)
@named in_outside = RealInputArray(nin = 2)
@variables x(t) = 0
f(x) = x .* x
f2(x) = x .* x
@register_array_symbolic f(x::AbstractVector) begin
    size = (2,)
end
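To see the difference the registration makes before involving any system, apply both functions to a symbolic array (a sketch, assuming the definitions above):

```julia
@variables y[1:2]
f(y)   # remains the opaque symbolic call f(y): structural_simplify
       # cannot look inside it, so the input array must be kept around
f2(y)  # is traced through to the elementwise expression y .* y,
       # which structural_simplify can substitute and eliminate
```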
eqs = [
    D(x) ~ x + in_outside.u[1],
    out_outside.u[1] ~ x,
    out_outside.u[2] ~ x + 1,
    out_inside.u ~ f2(in_inside.u),
    connect(out_inside, in_outside),
    connect(in_inside, out_outside)
]
@named sys = ODESystem(eqs, t)
ss = structural_simplify(sys)
If you use f2, then MTK can trace through the function and you get:
julia> equations(ss)
1-element Vector{Equation}:
Differential(t)(x(t)) ~ (out_inside₊u(t))[1] + x(t)
julia> ModelingToolkit.full_equations(ss)
1-element Vector{Equation}:
Differential(t)(x(t)) ~ x(t) + x(t)^2
But if you use f (which is registered), then:
julia> equations(ss)
3-element Vector{Equation}:
Differential(t)(x(t)) ~ (out_inside₊u(t))[1] + x(t)
0 ~ -(in_inside₊u(t))[1] + (out_outside₊u(t))[1]
0 ~ -(in_inside₊u(t))[2] + (out_outside₊u(t))[2]
julia> ModelingToolkit.full_equations(ss)
3-element Vector{Equation}:
Differential(t)(x(t)) ~ (f(in_inside₊u(t)))[1] + x(t)
0 ~ -(in_inside₊u(t))[1] + x(t)
0 ~ 1 - (in_inside₊u(t))[2] + x(t)
But I don't see any @register_array_symbolic in my code above, and I did not see anything in NeuralNetworkBlock either. Can you clarify, and tell me how to avoid this registration when creating a NeuralNetworkBlock?
Thanks
The @register_array_symbolic is applied to LuxCore.stateless_apply here: https://github.com/JuliaSymbolics/Symbolics.jl/blob/master/ext/SymbolicsLuxCoreExt.jl
Unfortunately, I am not sure we can avoid registration in NeuralNetworkBlock.
Yes, function registration is crucial for this to work.
As a workaround you can use eqs = [model.nn_in.u ~ nn.output.u, model.nn_out.u ~ nn.input.u] instead of eqs = [connect(model.nn_in, nn.output), connect(model.nn_out, nn.input)], but that's not ideal. Hopefully we can get this fixed somehow.
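Applied to the friction example from the top of the issue, the workaround looks like this (a sketch, assuming the same model and nn components as above):

```julia
# Equate the connector array variables directly instead of using
# connect(); per the discussion above, this lets structural_simplify
# eliminate the algebraic constraints, so no mass matrix remains.
eqs = [model.nn_in.u ~ nn.output.u,
       model.nn_out.u ~ nn.input.u]
ude_sys = complete(ODESystem(eqs, t, systems = [model, nn], name = :ude_sys))
sys = structural_simplify(ude_sys)
```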
Weirdly, it's not the connect itself that's the problem. It's the order of the resulting equations.
The system with connect has:
julia> equations(badude)
5-element Vector{Equation}:
Differential(t)(friction₊y(t)) ~ friction₊Fu - (friction₊nn_in₊u(t))[1]
friction₊y(t) ~ (friction₊nn_out₊u(t))[1]
nn₊output₊u(t) ~ LuxCore.stateless_apply(Chain{@NamedTuple{layer_1::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_2::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_3::Dense{typeof(identity), Int64, Int64, Nothing, Nothing, Static.False}}, Nothing}((layer_1 = Dense(1 => 10, mish, use_bias=false), layer_2 = Dense(10 => 10, mish, use_bias=false), layer_3 = Dense(10 => 1, use_bias=false)), nothing), nn₊input₊u(t), convert(nn₊T, nn₊p))
friction₊nn_in₊u(t) ~ nn₊output₊u(t)
friction₊nn_out₊u(t) ~ nn₊input₊u(t)
Whereas the one without connect has:
julia> equations(ude)
5-element Vector{Equation}:
friction₊nn_in₊u(t) ~ nn₊output₊u(t)
friction₊nn_out₊u(t) ~ nn₊input₊u(t)
Differential(t)(friction₊y(t)) ~ friction₊Fu - (friction₊nn_in₊u(t))[1]
friction₊y(t) ~ (friction₊nn_out₊u(t))[1]
nn₊output₊u(t) ~ LuxCore.stateless_apply(Chain{@NamedTuple{layer_1::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_2::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_3::Dense{typeof(identity), Int64, Int64, Nothing, Nothing, Static.False}}, Nothing}((layer_1 = Dense(1 => 10, mish, use_bias=false), layer_2 = Dense(10 => 10, mish, use_bias=false), layer_3 = Dense(10 => 1, use_bias=false)), nothing), nn₊input₊u(t), convert(nn₊T, nn₊p))
If I rearrange the latter to look like the former, it's a DAE again.