
Using NeuralNetworkBlock results in an ODESystem with mass matrix

Open hstrey opened this issue 1 year ago • 6 comments

When running the friction example, I noticed that after:

@named nn = NeuralNetworkBlock(1, 1; chain = chain2, rng = StableRNG(1111))

eqs = [connect(model.nn_in, nn.output)
       connect(model.nn_out, nn.input)]

ude_sys = complete(ODESystem(eqs, t, systems = [model, nn], name = :ude_sys))
sys = structural_simplify(ude_sys)

the resulting system has a mass matrix and the unknowns are:

unknowns(sys)
2-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 friction₊y(t)
 (nn₊input₊u(t))[1]

I dug a little deeper and noticed that nn₊input₊u has the irreducible flag set in its metadata.

I suggest not making this the default behavior, since a system with a mass matrix restricts the set of solvers that can be used for the problem. One could instead add an option to NeuralNetworkBlock that makes the input irreducible.
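To see the restriction concretely, here is a sketch (assuming the sys from above, and that the friction example supplies defaults for the initial conditions): the singular mass matrix makes the problem a DAE, ruling out explicit solvers.

```julia
using OrdinaryDiffEq

prob = ODEProblem(sys, [], (0.0, 1.0))
prob.f.mass_matrix  # singular: a zero row for the algebraic variable, so this is a DAE

# Explicit solvers such as Tsit5 cannot handle a singular mass matrix;
# a mass-matrix-capable (implicit) solver must be used instead, e.g.
sol = solve(prob, Rodas5())
```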

hstrey avatar Jun 07 '24 13:06 hstrey

This seems to happen due to function registration. Consider the following:

using ModelingToolkitStandardLibrary.Blocks
@named out_inside = RealOutputArray(nout=2)
@named out_outside = RealOutputArray(nout=2)
@named in_inside = RealInputArray(nin=2)
@named in_outside = RealInputArray(nin=2)
@variables x(t) = 0

f(x) = x .* x
f2(x) = x .* x

@register_array_symbolic f(x::AbstractVector) begin
    size=(2,)
end

eqs = [
    D(x) ~ x + in_outside.u[1],
    out_outside.u[1] ~ x,
    out_outside.u[2] ~ x+1,
    out_inside.u ~ f2(in_inside.u),
    connect(out_inside, in_outside),
    connect(in_inside, out_outside)
]

@named sys = ODESystem(eqs, t)
ss = structural_simplify(sys)

If you use f2, then MTK can trace through it and you get

julia> equations(ss)
1-element Vector{Equation}:
 Differential(t)(x(t)) ~ (out_inside₊u(t))[1] + x(t)

julia> ModelingToolkit.full_equations(ss)
1-element Vector{Equation}:
 Differential(t)(x(t)) ~ x(t) + x(t)^2

but if you use f (which is registered), then

julia> equations(ss)
3-element Vector{Equation}:
 Differential(t)(x(t)) ~ (out_inside₊u(t))[1] + x(t)
 0 ~ -(in_inside₊u(t))[1] + (out_outside₊u(t))[1]
 0 ~ -(in_inside₊u(t))[2] + (out_outside₊u(t))[2]

julia> ModelingToolkit.full_equations(ss)
3-element Vector{Equation}:
 Differential(t)(x(t)) ~ (f(in_inside₊u(t)))[1] + x(t)
 0 ~ -(in_inside₊u(t))[1] + x(t)
 0 ~ 1 - (in_inside₊u(t))[2] + x(t)
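The difference comes down to visibility: f2 is traced into a plain symbolic expression that structural_simplify can substitute through, while the registered f stays an opaque node, so the algebraic equations relating its input cannot be eliminated. A minimal scalar sketch of the same effect (g here is a hypothetical function, not from the code above):

```julia
using Symbolics
@variables y

g(y) = y^2
@register_symbolic g(y)

g(y)    # remains the opaque symbolic call g(y); nothing can be substituted through it
y * y   # an unregistered expression traces to y^2, which simplification can work with
```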

SebastianM-C avatar Jun 18 '24 10:06 SebastianM-C

But I don't see any @register_array_symbolic in my code above, and I did not see anything in NeuralNetworkBlock either. Can you clarify, and tell me how to avoid this registration when creating a NeuralNetworkBlock?

Thanks

hstrey avatar Jun 23 '24 16:06 hstrey

@register_array_symbolic is in stateless_apply - https://github.com/JuliaSymbolics/Symbolics.jl/blob/master/ext/SymbolicsLuxCoreExt.jl

Unfortunately, I am not sure we can avoid registration in NeuralNetworkBlock.

sathvikbhagavan avatar Jun 24 '24 04:06 sathvikbhagavan

Yes, function registration is crucial for this to work.

SebastianM-C avatar Jun 24 '24 21:06 SebastianM-C

As a workaround, you can use eqs = [model.nn_in.u ~ nn.output.u, model.nn_out.u ~ nn.input.u] instead of eqs = [connect(model.nn_in, nn.output), connect(model.nn_out, nn.input)], but that's not ideal. Hopefully we can get this fixed somehow.
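Spelled out against the friction example from the first post, the workaround would look like this (a sketch; variable and system names are taken from the original code above):

```julia
# Direct array equalities instead of connect():
eqs = [model.nn_in.u ~ nn.output.u,
       model.nn_out.u ~ nn.input.u]

ude_sys = complete(ODESystem(eqs, t, systems = [model, nn], name = :ude_sys))
sys = structural_simplify(ude_sys)
# With these equations, structural_simplify can eliminate the network input,
# leaving a pure ODE (identity mass matrix) instead of a DAE.
```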

SebastianM-C avatar Mar 26 '25 23:03 SebastianM-C

Weirdly, it's not the connect itself that's the problem. It's the order of the resulting equations.

The system with connect has:

julia> equations(badude)
5-element Vector{Equation}:
 Differential(t)(friction₊y(t)) ~ friction₊Fu - (friction₊nn_in₊u(t))[1]
 friction₊y(t) ~ (friction₊nn_out₊u(t))[1]
 nn₊output₊u(t) ~ LuxCore.stateless_apply(Chain{@NamedTuple{layer_1::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_2::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_3::Dense{typeof(identity), Int64, Int64, Nothing, Nothing, Static.False}}, Nothing}((layer_1 = Dense(1 => 10, mish, use_bias=false), layer_2 = Dense(10 => 10, mish, use_bias=false), layer_3 = Dense(10 => 1, use_bias=false)), nothing), nn₊input₊u(t), convert(nn₊T, nn₊p))
 friction₊nn_in₊u(t) ~ nn₊output₊u(t)
 friction₊nn_out₊u(t) ~ nn₊input₊u(t)

Whereas the one without connect has:

julia> equations(ude)
5-element Vector{Equation}:
 friction₊nn_in₊u(t) ~ nn₊output₊u(t)
 friction₊nn_out₊u(t) ~ nn₊input₊u(t)
 Differential(t)(friction₊y(t)) ~ friction₊Fu - (friction₊nn_in₊u(t))[1]
 friction₊y(t) ~ (friction₊nn_out₊u(t))[1]
 nn₊output₊u(t) ~ LuxCore.stateless_apply(Chain{@NamedTuple{layer_1::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_2::Dense{typeof(mish), Int64, Int64, Nothing, Nothing, Static.False}, layer_3::Dense{typeof(identity), Int64, Int64, Nothing, Nothing, Static.False}}, Nothing}((layer_1 = Dense(1 => 10, mish, use_bias=false), layer_2 = Dense(10 => 10, mish, use_bias=false), layer_3 = Dense(10 => 1, use_bias=false)), nothing), nn₊input₊u(t), convert(nn₊T, nn₊p))

If I rearrange the latter to look like the former, it's a DAE again.

AayushSabharwal avatar Apr 01 '25 06:04 AayushSabharwal