ModelingToolkit.jl icon indicating copy to clipboard operation
ModelingToolkit.jl copied to clipboard

Suggestion: Make `plot` plot unknowns AND observables by default (not just unknowns)

Open TorkelE opened this issue 8 months ago • 8 comments

When splitting what variables become unknowns and what are observables, I feel like it makes sense to lump them together in the plot (i.e. plotting all by default). I.e. here:

using OrdinaryDiffEqDefault, ModelingToolkit, Plots
using ModelingToolkit: t_nounits as t, D_nounits as D

# Create model.
@variables X(t) Y(t)
@parameters p d k1 k2 = 0.5
eqs = [
    D(X) ~ p - d*X,
    k1 + Y ~ k2*X
]
@mtkbuild osys = ODESystem(eqs, t)

# Create `ODEProblem`.
u0 = [X => 2.0]
pvals = [p => 1.0, d => 0.1, k1 => 0.5]
oprob = ODEProblem(osys, u0, (0.0, 10.0), pvals)

# Solve and plot it.
sol = solve(oprob)
plot(sol)

only X is plotted (unless one specify Y explicitly). Now from a user perspective there is no real difference here, they don't know why X is plotted and not Y. I think this is one of (if not the only?) cases where structural_simplify have a direct impact on workflows.

Furthermore, changing this would:

  • Mean that we can be a lot more liberal with updating the internal algorithms of structural_simplify. I.e. there could be minor updates to its algorithm that suddenly changes so that X is eliminated and Y is not (probably not in the example above, but in more complicated ones). If the observables are ploted, the user will never notice or case.
  • The case of eqs = [D(X) ~ -p] is one that can be solved directly, i.e. the only unknown of the system becomes an observable. Here, if you do plot(solve(prob)) you will just get an empty plot plane, which seems really confusing.

I think the other alternative is to just enforcethe use of idxs = in plot. That makes sense for larger models. However, for small models this will be annoying.

TorkelE avatar Apr 10 '25 09:04 TorkelE

Even small models may have 1000s of observables..

baggepinnen avatar Apr 10 '25 09:04 baggepinnen

I mean, this entierly depends on what you are using MTK for right? For people who does DAEs with very few differential equations and many many algebraic equations, yes. But I (and many others) have very few algebraic equations. Furthermore, when we use them it is typically because it is some quantity that we explicitly want to plot. I don't think we should specialise MTK to cater only to one specific field.

TorkelE avatar Apr 10 '25 09:04 TorkelE

Sure but if you try to plot 1000s of signals Julia will crash, so that's not a reasonable default

baggepinnen avatar Apr 10 '25 09:04 baggepinnen

So there are cases where one option makes sense and another doesn't? In Sys bio the main reason for observables is pretty much the opposite (I have 3,000 differential equations, butonly want to plot ~5 observables. So I distinctly specify these things to plot to avoid the crash).

Ultimately, to the user, there isn't really any difference to observables/unknowns, they are all hidden stuff anyway? This feels exactly like the cases when you should use idxs = to specify which quantities of you model you do want to plot.

If our goal is to avoid accidental plotting of too much stuff, having a some heuristic/limit seems like the correct solution right. By default never plot more than 20/100/some other number of quantities, that sounds sensible. Anyone who does this (whether this is due to them having many differential/algebraic equations, or both) are probably making a mistake (and again one can plot all quantities using the idxs = unknowns(sys) and similar functions). And this way we again actually catches the problem across all types of models, and not just the few differentials + many algebraic equations one.

TorkelE avatar Apr 10 '25 09:04 TorkelE

So there are cases where one option makes sense and another doesn't? In Sys bio the main reason for observables is pretty much the opposite (I have 3,000 differential equations, butonly want to plot ~5 observables. So I distinctly specify these things to plot to avoid the crash).

The default plot is also not great for things like SBML reads where it can also crash your computer 😅, but including the observed by default only compounds to it. So I don't think we should plot the observables by default, but we should document an easy to do it. I don't know if we do.

If our goal is to avoid accidental plotting of too much stuff, having a some heuristic/limit seems like the correct solution right. By default never plot more than 20/100/some other number of quantities, that sounds sensible. Anyone who does this (whether this is due to them having many differential/algebraic equations, or both) are probably making a mistake (and again one can plot all quantities using the idxs = unknowns(sys) and similar functions). And this way we again actually catches the problem across all types of models, and not just the few differentials + many algebraic equations one.

Yes we should probably limit it by default to the first 100 indices and throw a warning.

ChrisRackauckas avatar Apr 10 '25 11:04 ChrisRackauckas

One could even imagine an heuristic where:

  • If fewer than 25 unknowns + observables, plot everything.
  • If fewer than 25 unknowns, but more than 1000 observables, plot only unknowns.
  • If fewer than 25 observables, but more than 100 unknowns, plot only observables.
  • If more than 100, plot the first hundred. Basically we say that, if idxs is not used, MTK automatically tries to figure out what things the user might want to plot.

In practise it might be sensible not to meddle too much within, and as Chris said, only plot the first 100 (and in this case I'd think one might as well throw the observables in here, at least in certain cases, but that is my opinion). If there is more than 100, throw a warning telling the uer that they probably want to plot everything).

Then we add a function obssyms(sys) which extracts the observable symbols so one can do something like idxs = obssyms(sys) (or even have a Symbol option which the plot recipe automatically interprets).

TorkelE avatar Apr 10 '25 11:04 TorkelE

To me it makes most sense that the plotting (continues to) show a minimal representation of the solution, i.e. only the unknowns, by default. But I do think it sounds like a good idea to error/warn if there are more than e.g. 100 and idxs is not explicitly set.

Then we add a function obssyms(sys) which extracts the observable symbols so one can do something like idxs = obssyms(sys) (or even have a Symbol option which the plot recipe automatically interprets).

I think such a function sounds like a very good idea. Personally I think obssyms(sys) is slightly too obscure for public API, and I would prefer something like observeds(sys) for consistency/symmetry with unknowns(sys), which would just be equivalent to observeds(sys) = map(eq -> eq.lhs, observed(sys)). Then I think it becomes very nice and intuitive to write e.g.

using Plots
plot(sol) # maybe this should set idxs = nothing, so we can detect it when idxs is not explicitly set and error/warn
plot(sol, idxs = unknowns(sys))
plot(sol, idxs = [unknowns(sys); observeds(sys)])
# ...

hersle avatar Apr 10 '25 12:04 hersle

I think such a function sounds like a very good idea. Personally I think obssyms(sys) is slightly too obscure for public API, and I would prefer something like observeds(sys) for consistency/symmetry with unknowns(sys), which would just be equivalent to observeds(sys) = map(eq -> eq.lhs, observed(sys)). Then I think it becomes very nice and intuitive to write e.g.

I agree we've been missing this one. I've been writing map(eq -> eq.lhs, observed(sys)) while debugging really often 😅

ChrisRackauckas avatar Apr 10 '25 12:04 ChrisRackauckas

A potential solution here (instead of having MTK count variables/observables to figure out the best behaviour) could be to use the new system metadata interface, i.e. we could have a metadata which designates that a system should use a specific plotting behaviour. I.e. we could add a metadata to Catalyst ReactionSystems (and their derived Systems). When MTK encounters this it knows that observables should be printed as well.

TorkelE avatar Jul 01 '25 09:07 TorkelE

Yeah I think having some form of problem metadata that allows front ends to set some defaults would be fine. We kind of have that for the solve by allowing kwargs in the ODEProblem, but those aren't passed to the plotting functionality. Definitely not unreasonable.

ChrisRackauckas avatar Jul 03 '25 14:07 ChrisRackauckas