Catalyst.jl
supporting SBML efforts
@paulflang I wanted to open an issue in Catalyst to track anything we can do to help with your SBML efforts. I looked at https://github.com/SciML/CellMLToolkit.jl/issues/15 and noticed you mentioned a few things:
- Compartments
- Volumes
Are there any other potential issues?
With regards to those two, we would like to head towards allowing spatial structures, so including compartments in some way would ultimately make sense. I'm not sure if the best approach right now is to add a compartment(s) field/tag, or instead to expect users to build separate `ReactionSystem`s for each compartment and then use the newer compositional modeling functionality of ModelingToolkit to link the systems together. (We don't yet support this in Catalyst for `ReactionSystem`s, but I hope we can get it going soon.)
With regards to volumes: can't the `Reaction`s be generated with a volume parameter in the rate constants, which, if set to one, allows for units of concentration, but otherwise, when given a dimensional value, converts the rate constant to units of probability per time when used in stochastic simulations? Or is it more complicated in what SBML could provide us?
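Roughly what I have in mind, as a hedged sketch (assuming a recent Catalyst syntax; the volume parameter `V` and the species names are purely illustrative):

```julia
using Catalyst

t = default_t()
@parameters k V            # V = compartment volume; V = 1 recovers concentration units
@species A(t) B(t) C(t)

# bimolecular binding with the compartment volume folded into the rate constant;
# setting V = 1 gives back the plain concentration-based form
rx = Reaction(k / V, [A, B], [C])
@named rs = ReactionSystem([rx], t)
```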
Finally, what were the stochastic simulation issues you were encountering (or was it just this volume/compartment issue)?
Thank you very much for helping with our SBML efforts! Greatly appreciated.
First, I have to ask the following: what is the dimension of kinetic laws in a `ReactionSystem`? My understanding was that it is 1/time. SBML and, afaik, BNGL handle it that way. They do so because otherwise handling compartments of different size would become very cumbersome. What is the dimension of a variable? To me it seems implicitly assumed that it is a number (or equivalently a concentration, where the assumption for the microscopic-to-macroscopic rate law conversion is that N_avogadro = 1/Volume, for instance N_avogadro = Volume = 1).
Second, I should maybe mention that an SBML species is a particle in a compartment. So if particle A moves from compartment X to compartment Y it becomes a different SBML species. The modeller would have to assign different names to A in different compartments, for instance A_x and A_y.
Now to your comment: if the user input is in the form of microscopic rates (or equivalently, macroscopic rates where N_avogadro = 1/Volume, for instance N_avogadro = Volume = 1), stochastic simulations should run without any problems, even in models with different compartment sizes. If a reaction transports n particles from compartment X to Y, then X has n particles less and Y has n particles more. This should also work for ODE models, as long as we stick to the same units of ("continualised") particle numbers. The problem arises when we use units of concentration with N_avogadro != 1/Volume.
> can't the Reactions be generated with a volume parameter in the rate constants
I don't think so, because if we use concentrations and X is twice as large as Y, then A would appear in Y twice as fast as it disappears from X. So assigning a single parameter to the reaction, to be used in the ODEs for both A_x and A_y, would not work. What we could do is to not only specify `substoich` and `prodstoich` in an `MTK.Reaction` but also add `volumes` (which in this case would for instance be `[2,1]`). But that would have to be added to every `MTK.Reaction`. This duplicate volume specification has huge potential for users to introduce errors/contradictions. And furthermore, this would still require adding a new field to one of the MTK types (in this case `Reaction` instead of `ReactionSystem`).
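To spell out the bookkeeping with made-up numbers (plain Julia, just illustrating the argument above):

```julia
# transport of A from compartment X (volume 2) to compartment Y (volume 1)
# extensive form, amounts n_x and n_y, transport rate v in amount/time:
#     dn_x/dt = -v,        dn_y/dt = +v         (total amount conserved)
# intensive form, concentrations c_x = n_x/V_x and c_y = n_y/V_y:
#     dc_x/dt = -v/V_x,    dc_y/dt = +v/V_y
V_x, V_y = 2.0, 1.0
v = 1.0                  # amount leaving X per unit time
dc_x = -v / V_x          # -0.5 concentration units per time
dc_y = +v / V_y          # +1.0 concentration units per time: twice as fast
# so a single shared rate parameter in concentration units cannot describe both sides
```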
> using the newer compositional modeling functionality of ModelingToolkit
I have not looked into this functionality yet, but I also see a problem here. For instance, one may want to combine a metabolic submodel with a signalling network submodel into a larger integrative model. If there is no way to specify compartments other than creating submodels, we would have, say, a nuclear and a cytoplasmic model nested under both the metabolic submodel and the signalling network submodel. Again, this gives potential for the user to introduce contradictions wrt compartmental volumes.
On the other hand, if an `MTK.ReactionSystem` would optionally contain a compartment volume specification (only needed if there is more than one compartment in the model anyway), the compartment volume would be specified only once, avoiding contradictions. The notion of compartment volume would also allow converting the output solution of an ODEProblem to concentrations, without having to assume unit volume.
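For illustration (plain Julia, hypothetical species and volume values), converting an amounts-based solution to concentrations then becomes trivial:

```julia
# species => compartment volume, stored once with the system
volumes = Dict(:A_x => 2.0, :A_y => 1.0)

# one time point of an amounts-based ODE solution
amounts = Dict(:A_x => 4.0, :A_y => 3.0)

# report concentrations without assuming unit volume
concentrations = Dict(s => n / volumes[s] for (s, n) in amounts)
```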
The units are whatever the user's parameters, initial condition, and time variable implicitly define when they construct the concrete problem type. I just tried a couple of examples with homodimerization reactions with Unitful, using micro-molar concentrations and seconds, and everything seemed to work fine for ODEs (i.e. the solution that comes out has the right units). Jump models wouldn't be right since they implicitly assume units of "number of" in the rate law, but ODEs (and I think SDEs) should not have a unit restriction as far as I am aware. I don't think we should require users who are only using the ODE functionality to work in units of "number of".
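For example (a sketch, assuming a recent Catalyst syntax and leaving Unitful out; the numbers are only read as µM and seconds by convention, nothing in the system fixes the units):

```julia
using Catalyst, OrdinaryDiffEq

# homodimerization 2A --> A2; the units are whatever u0, k, and tspan are
# interpreted in (here: µM, 1/(µM*s), and s)
rn = @reaction_network begin
    k, 2A --> A2
end

oprob = ODEProblem(rn, [:A => 10.0, :A2 => 0.0], (0.0, 5.0), [:k => 0.2])
sol = solve(oprob, Tsit5())
```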
> I don't think so. Because if we use concentrations, and X is twice as large as Y,
Sorry if I wasn't clear. I meant that the rate constant for a given reaction should be scaled by the volume of the compartment it is defined in (I'm imagining separate `Reaction` instances for separate compartments). If you are encoding species/reactions in two compartments, then you should make two volume parameters, one for each compartment (i.e. `V_N` and `V_C` for nucleus and cytosol), and use them appropriately when defining rate constant expressions within `Reaction`s.
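In code, something like this sketch (assuming a recent Catalyst syntax; the names are illustrative):

```julia
using Catalyst

t = default_t()
@parameters k V_N V_C                      # shared volume parameters, defined once
@species A_N(t) B_N(t) AB_N(t) A_C(t) B_C(t) AB_C(t)

# the same binding step written once per compartment, with the rate constant
# scaled by that compartment's volume parameter
rxs = [Reaction(k / V_N, [A_N, B_N], [AB_N]),    # nucleus
       Reaction(k / V_C, [A_C, B_C], [AB_C])]    # cytosol
@named rs = ReactionSystem(rxs, t)
```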
If we become responsible for handling volumes at the `ReactionSystem` level, don't we then need to know the user's units (for both the species and the rate constants)? For example, if the user wants to use units of concentration but has rates in units of inverse time we need to set up one set of volume scalings, while another user who wants to use units of "number of" but has rates in macroscopic units (i.e. inverse concentration, inverse time for bimolecular reactions) would require us to introduce a different scaling.
This feels to me like something best handled in higher-level tooling (for example in Catalyst), or directly by users (but I'm happy to be convinced otherwise!). I.e. whatever sets up the `ReactionSystem`(s) ensures there are shared volume parameters for each compartment, with rate constants scaled appropriately for the units that will be used. (~~And maybe~~ we should add a nice way to set up compartments and/or specify a set of units for which we handle the volume scaling.)
@ChrisRackauckas @TorkelE any thoughts?
It does seem like it's starting to touch on something higher level about shared structures and spatial systems, though it's similar in nature to things like noise scaling and diagonal noise SDE approximations. I think a `scaled_ODESystem(reactionsys, volumes = reactionsys.volumes)` could make sense, where it goes in and scales everything, and then `volumes` could be a field in the reactionsys to easily parse from SBML. IIUC that would be required, as the rate itself doesn't determine the rate constants of the ODE without this (in the SBML form).
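Something along these lines, purely as a sketch of the idea (nothing like this exists; the exact scaling convention is the open question in this thread):

```julia
using Catalyst, ModelingToolkit

# hypothetical helper: lower the ReactionSystem to an ODESystem, then divide
# each equation's right-hand side by the corresponding compartment volume
function scaled_ODESystem(rs, volumes::AbstractVector)
    osys = convert(ODESystem, rs)
    eqs  = [eq.lhs ~ eq.rhs / V for (eq, V) in zip(equations(osys), volumes)]
    return ODESystem(eqs, ModelingToolkit.get_iv(osys); name = nameof(osys))
end
```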
I guess at that point we could just have the `volumes` in the `ReactionSystem` (default to 1) and divide them out by default in `convert(ODESystem, ...)` and `convert(SDESystem, ...)`.
On the topic of spatial things, I've always imagined that the natural way to go is to have one `ReactionSystem` in each compartment (in most cases probably the same `ReactionSystem`).
I imagine most people using `ReactionSystem`s would not consider volumes, but if one needs to consider them I guess the `ReactionSystem` would be the natural place (or alternatively, if we increasingly need to add stuff, some SBML Reaction System structure containing a much wider definition of a model). If so, some trivial default value like 1 (or `nothing`, with a check ensuring it is ignored if not set to something else) seems the best way to go?
(caveat: I'm not very familiar with SBML)
I like the idea of keeping `ReactionSystem`s as minimal and barebones as needed. This will make it easier to define future transforms, since there are fewer things to worry about having to handle. This is to some degree the difference between things like BNGL (which is very expressive) and the simpler .net representation used by BioNetGen (which is really just a list of reactions with some parameter formulas).
@paulflang what volume scaling do you want/need for SBML in `ReactionSystem`s? I.e. is the assumption that rates are always in concentration units, so we need to sometimes convert to per-time? Or vice versa? (I understand about the compartment issue, but don't quite see if there is a "unique" volume scaling that is always needed.)
I also like the idea of keeping `ReactionSystem`s as minimal and barebones as needed. In particular, like SBML and also BNGL, a `ReactionSystem` description should stay agnostic about the algorithm (SSA or ODE) it is simulated with, even if the described network contains transport of particles between compartments of different size. To achieve this, SBML and BNGL internally represent rate laws as extensive properties rather than intensive properties (quantity/time rather than quantity/size/time). The reasoning can be found on p. 154 here and on pp. 76-78 here.
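For reference, the conversion between the two conventions for a bimolecular mass-action step, with made-up numbers (plain Julia):

```julia
# intensive (concentration-based) rate for A + B -> C in a compartment of volume V:
#     v_int = k * c_A * c_B                        [concentration / time]
# extensive (amount-based) rate:
#     v_ext = V * v_int = (k / V) * n_A * n_B      [amount / time]
k, V = 0.3, 2.0
n_A, n_B = 10.0, 5.0
c_A, c_B = n_A / V, n_B / V
v_int = k * c_A * c_B
v_ext = V * v_int                # identical to (k / V) * n_A * n_B
```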
Therefore, I totally agree with Chris when he says:

> I guess at that point we could just have the volumes in the ReactionSystem (default to 1) and divide them out by default in `convert(ODESystem, ...)` and `convert(SDESystem, ...)`
I see no need to bother with adding units to a `ReactionSystem`. The only two things I'd like to add are

- that we need to know which species resides in which compartment in order to divide by the correct volume. So maybe the `states` field in `MTK.ReactionSystem` should accept `variable => compartment` pairs (like SBML reaction systems and unlike BNGL rules, I suggest a many-to-one mapping from species to compartments; see the sketch below).
- that we need to decide how `u0` is interpreted when fed into a `DEProblem`: always as an extensive property, or, depending on the problem type, sometimes as an intensive property (e.g. for an ODEProblem)?
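Here is the small sketch mentioned above (plain Julia, illustrative names only):

```julia
# many-to-one species => compartment map, plus one volume per compartment
species_compartment = Dict(:A_nuc => :nucleus, :A_cyt => :cytosol, :B_cyt => :cytosol)
compartment_volume  = Dict(:nucleus => 1.0, :cytosol => 3.0)

# if u0 is supplied as concentrations (intensive), convert it to amounts (extensive)
u0_conc   = Dict(:A_nuc => 2.0, :A_cyt => 0.5, :B_cyt => 1.0)
u0_amount = Dict(s => c * compartment_volume[species_compartment[s]]
                 for (s, c) in u0_conc)
```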
Hi, just trying to catch up on this. How shall we best proceed?
If I understand correctly from the SBML document, is what you are looking for the ability to pass a species-dependent scaling factor that simply multiplies the corresponding ODE/SDE rhs (what about jumps?), i.e. what they call `conversionFactor`s?
For now jumps are definitely extensive, but for ODEs and SDEs we should continue to allow users to use whatever they want. I can imagine in some non-bio fields it would be very unusual to not work in concentration units, where the number of molecules in a reaction may be enormous.
So I think that leaves explicit representations of compartments (instead of just generating all the individual reactions yourself). I’m hoping to have a system composition PR together soon if I can get a little time to work on it, so perhaps then you could take a look and see if that would work with what you need from SBML.
> what about jumps?
Mhh, this is actually a problem, as we would get non-integer values, right?
Perhaps the cleanest approach would be that we do not directly translate SBML to a `ReactionSystem`, but add a step before that converts all SBML models that are using intensive properties to extensive properties. Then we convert to a `ReactionSystem` and it just works for ODEs, SDEs and jumps without them being aware of volumes. No `conversionFactor` needed.
I am just not sure atm how straightforward the intensive to extensive conversion would be. Need to look into that.
Do SBML files ever encode a model that is both for ODEs and jumps, or are they supposed to use separate files? Looking at the document you linked this wasn't clear to me (but I somehow came away with the impression that the type of mathematical model is supposed to be encoded in the file, so there would be separate files for a jump vs. ODE version of a model). In that case could it be expected that SBML jump model files will have units so that a conversion factor is not needed?
My impression is that SBML encodes models without being clear if they are for ODEs or jumps.
- SBML allows you to set SBO terms to indicate whether a given `kineticLaw` is meant for continuous or discrete interpretation, but they are optional and you can put a different SBO term on different `kineticLaw`s in the same model.
- SBML allows you to set the `u0` of a species as `initialAmount` or `initialConcentration`, or not at all (e.g. if these quantities are obtained via `initialAssignment`s).
- As neither `initialAmount` nor `initialConcentration` is required, they cannot be used to infer whether a species, if it appears in a `kineticLaw`, appears as an amount or a concentration. Therefore, species have the `hasOnlySubstanceUnits` flag (if true, the species appears as an amount; if false, as a concentration). Again, species in the same model can differ wrt whether this flag is set or not.
Tbh I find SBML unnecessarily flexible, in the sense that it allows you to mix up units that do not really belong together. But I suggest that, prior to importing into a `ReactionSystem`, we convert all SBML files to a common format using `initialAmount` and `hasOnlySubstanceUnits=true`. That is:

- multiply each `initialConcentration` by the volume in which the species resides, put this value in a newly created `initialAmount` attribute, and delete the `initialConcentration` attribute.
- with all species now measured in amounts, `kineticLaw`s in which a species appears with the `hasOnlySubstanceUnits` flag set to `false` will be incorrect. In such `kineticLaw`s, divide every occurrence of that species by the size of the volume it resides in (a sketch of this step follows below).
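A sketch of that second step using Symbolics (assuming the kinetic law has already been parsed into a symbolic expression; `S`, `E`, `k`, `V` are made-up names):

```julia
using Symbolics

@variables S E k V
# original kineticLaw with S entered as a concentration (hasOnlySubstanceUnits = false)
ratelaw = k * E * S
# after the conversion S stands for an amount, so every occurrence becomes S / V
amount_law = substitute(ratelaw, Dict(S => S / V))    # == k * E * (S / V)
```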
That way we make sure that everything that ends up expressed in a `ReactionSystem` is an extensive property. I.e. we have taken care of volumes before creating the `ReactionSystem` and no longer have to bother with them when creating ODE/SDE/JumpSystems. Sorry I did not see it that way earlier, if that's what you've been trying to convince me of all along. Wrt scaling/Avogadro's number, we would expect that users have chosen the scaling they need when creating the SBML file already. Does all of that seem reasonable?
Yes, I think the problem you are recognizing is basically what I was trying (perhaps poorly) to explain!
I guess check back here if you do end up needing anything. For example, that `conversionFactor` constant for ODEs may be difficult to handle through a `ReactionSystem`, since it doesn't modify reactions but instead modifies an entire ODE. We could add a field to store it and apply it when generating ODE/SDE models if needed. Alternatively, if I can make progress on allowing variables/expressions for stoichiometric coefficients then it could be handled through them.
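For what it's worth, the effect to reproduce would just be a per-species multiplier on the assembled right-hand side, roughly (plain Julia, illustrative values only):

```julia
# SBML's conversionFactor scales a species' entire rate of change:
#     du_i/dt = cf_i * sum_j S_ij * v_j(u)
du = [-0.2, 0.2]            # assembled RHS (stoichiometry times rates) for two species
cf = [1.0, 2.0]             # per-species conversion factors
du_scaled = cf .* du        # == [-0.2, 0.4]
```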