ModelingToolkit.jl
ModelingToolkit.jl copied to clipboard
Suggestion: Make `plot` plot unknowns AND observables by default (not just unknowns)
When splitting what variables become unknowns and what are observables, I feel like it makes sense to lump them together in the plot (i.e. plotting all by default). I.e. here:
using OrdinaryDiffEqDefault, ModelingToolkit, Plots
using ModelingToolkit: t_nounits as t, D_nounits as D
# Create model.
@variables X(t) Y(t)
@parameters p d k1 k2 = 0.5
eqs = [
D(X) ~ p - d*X,
k1 + Y ~ k2*X
]
@mtkbuild osys = ODESystem(eqs, t)
# Create `ODEProblem`.
u0 = [X => 2.0]
pvals = [p => 1.0, d => 0.1, k1 => 0.5]
oprob = ODEProblem(osys, u0, (0.0, 10.0), pvals)
# Solve and plot it.
sol = solve(oprob)
plot(sol)
only X is plotted (unless one specify Y explicitly). Now from a user perspective there is no real difference here, they don't know why X is plotted and not Y. I think this is one of (if not the only?) cases where structural_simplify have a direct impact on workflows.
Furthermore, changing this would:
- Mean that we can be a lot more liberal with updating the internal algorithms of
structural_simplify. I.e. there could be minor updates to its algorithm that suddenly changes so thatXis eliminated andYis not (probably not in the example above, but in more complicated ones). If the observables are ploted, the user will never notice or case. - The case of
eqs = [D(X) ~ -p]is one that can be solved directly, i.e. the only unknown of the system becomes an observable. Here, if you doplot(solve(prob))you will just get an empty plot plane, which seems really confusing.
I think the other alternative is to just enforcethe use of idxs = in plot. That makes sense for larger models. However, for small models this will be annoying.
Even small models may have 1000s of observables..
I mean, this entierly depends on what you are using MTK for right? For people who does DAEs with very few differential equations and many many algebraic equations, yes. But I (and many others) have very few algebraic equations. Furthermore, when we use them it is typically because it is some quantity that we explicitly want to plot. I don't think we should specialise MTK to cater only to one specific field.
Sure but if you try to plot 1000s of signals Julia will crash, so that's not a reasonable default
So there are cases where one option makes sense and another doesn't? In Sys bio the main reason for observables is pretty much the opposite (I have 3,000 differential equations, butonly want to plot ~5 observables. So I distinctly specify these things to plot to avoid the crash).
Ultimately, to the user, there isn't really any difference to observables/unknowns, they are all hidden stuff anyway? This feels exactly like the cases when you should use idxs = to specify which quantities of you model you do want to plot.
If our goal is to avoid accidental plotting of too much stuff, having a some heuristic/limit seems like the correct solution right. By default never plot more than 20/100/some other number of quantities, that sounds sensible. Anyone who does this (whether this is due to them having many differential/algebraic equations, or both) are probably making a mistake (and again one can plot all quantities using the idxs = unknowns(sys) and similar functions). And this way we again actually catches the problem across all types of models, and not just the few differentials + many algebraic equations one.
So there are cases where one option makes sense and another doesn't? In Sys bio the main reason for observables is pretty much the opposite (I have 3,000 differential equations, butonly want to plot ~5 observables. So I distinctly specify these things to plot to avoid the crash).
The default plot is also not great for things like SBML reads where it can also crash your computer 😅, but including the observed by default only compounds to it. So I don't think we should plot the observables by default, but we should document an easy to do it. I don't know if we do.
If our goal is to avoid accidental plotting of too much stuff, having a some heuristic/limit seems like the correct solution right. By default never plot more than 20/100/some other number of quantities, that sounds sensible. Anyone who does this (whether this is due to them having many differential/algebraic equations, or both) are probably making a mistake (and again one can plot all quantities using the idxs = unknowns(sys) and similar functions). And this way we again actually catches the problem across all types of models, and not just the few differentials + many algebraic equations one.
Yes we should probably limit it by default to the first 100 indices and throw a warning.
One could even imagine an heuristic where:
- If fewer than 25 unknowns + observables, plot everything.
- If fewer than 25 unknowns, but more than 1000 observables, plot only unknowns.
- If fewer than 25 observables, but more than 100 unknowns, plot only observables.
- If more than 100, plot the first hundred.
Basically we say that, if
idxsis not used, MTK automatically tries to figure out what things the user might want to plot.
In practise it might be sensible not to meddle too much within, and as Chris said, only plot the first 100 (and in this case I'd think one might as well throw the observables in here, at least in certain cases, but that is my opinion). If there is more than 100, throw a warning telling the uer that they probably want to plot everything).
Then we add a function obssyms(sys) which extracts the observable symbols so one can do something like idxs = obssyms(sys)
(or even have a Symbol option which the plot recipe automatically interprets).
To me it makes most sense that the plotting (continues to) show a minimal representation of the solution, i.e. only the unknowns, by default. But I do think it sounds like a good idea to error/warn if there are more than e.g. 100 and idxs is not explicitly set.
Then we add a function obssyms(sys) which extracts the observable symbols so one can do something like idxs = obssyms(sys) (or even have a Symbol option which the plot recipe automatically interprets).
I think such a function sounds like a very good idea. Personally I think obssyms(sys) is slightly too obscure for public API, and I would prefer something like observeds(sys) for consistency/symmetry with unknowns(sys), which would just be equivalent to observeds(sys) = map(eq -> eq.lhs, observed(sys)). Then I think it becomes very nice and intuitive to write e.g.
using Plots
plot(sol) # maybe this should set idxs = nothing, so we can detect it when idxs is not explicitly set and error/warn
plot(sol, idxs = unknowns(sys))
plot(sol, idxs = [unknowns(sys); observeds(sys)])
# ...
I think such a function sounds like a very good idea. Personally I think obssyms(sys) is slightly too obscure for public API, and I would prefer something like observeds(sys) for consistency/symmetry with unknowns(sys), which would just be equivalent to observeds(sys) = map(eq -> eq.lhs, observed(sys)). Then I think it becomes very nice and intuitive to write e.g.
I agree we've been missing this one. I've been writing map(eq -> eq.lhs, observed(sys)) while debugging really often 😅
A potential solution here (instead of having MTK count variables/observables to figure out the best behaviour) could be to use the new system metadata interface, i.e. we could have a metadata which designates that a system should use a specific plotting behaviour. I.e. we could add a metadata to Catalyst ReactionSystems (and their derived Systems). When MTK encounters this it knows that observables should be printed as well.
Yeah I think having some form of problem metadata that allows front ends to set some defaults would be fine. We kind of have that for the solve by allowing kwargs in the ODEProblem, but those aren't passed to the plotting functionality. Definitely not unreasonable.