Catalyst.jl icon indicating copy to clipboard operation
Catalyst.jl copied to clipboard

supporting SBML efforts

Open isaacsas opened this issue 3 years ago • 14 comments

@paulflang I wanted to open an issue in Catalyst to track anything we can do to help with your SBML efforts. I looked at https://github.com/SciML/CellMLToolkit.jl/issues/15 and noticed you mentioned a few things:

  1. Compartments
  2. Volumes

Is there anything else that are potential issues?

isaacsas avatar Mar 08 '21 22:03 isaacsas

We regards to those two, we would like to head towards allowing spatial structures, so including compartments in some way would ultimately make sense. I'm not sure if the best approach right now is to add a compartment(s) field/tag, or instead expecting users to build separate ReactionSystems for each compartment and then using the newer compositional modeling functionality of ModelingToolkit to link the systems together. (We don't yet support this in Catalyst for ReactionSystems, but I hope we can get it going soon.)

With regards to volumes; can't the Reactions be generated with a volume parameter in the rate constants, which if set to one allows for units of concentration but otherwise when given a dimensional value converts the rate constant to units of probability per time when used in stochastic simulations? Or is it more complicated in what SBML could provide us?

Finally, what were the stochastic simulation issues you were encountering (or was it just this volume/compartment issue)?

isaacsas avatar Mar 08 '21 22:03 isaacsas

Thank you very much for helping with our SBML efforts! Greatly appreciated.

First, I have to ask the following: what is the dimension of kinetic laws in a ReactionSystem? My understanding was that it is 1/time. SBML and afaik BNGL handle it that way. They do so, because otherwise handling compartments of different size would become very cumbersome. What is the dimension of a variable? To me it seems implicitly assumed that it is a number (or equivalently, a concentration, where the assumption for microscopic to macroscopic rate law conversion is made that N_avogadro = 1/Volume, for instance N_avogadro = Volume = 1).

Second, I should maybe mention that a SBML species is a particle in a compartment. So if particle A moves from compartment X to compartment Y it becomes a different SBML species. The modeller would have to assign different names to A in different compartments, for instance A_x and A_y.

Now to your comment: If the user input is in form of microscopic rates (or equivalently, macroscopic rates where N_avogadro = 1/Volume, for instance N_avogadro = Volume = 1) stochastic simulations should run without any problems, even in models with different compartment sizes. If a reaction transports n particles from compartment X to Y, then X has n particles less and y has n particles more. This should also work for ODE models, as long as we stick to the same units of ("continualised") particle numbers. Problem is when we use units of concentrations with N_avogadro != 1/Volume.

can't the Reactions be generated with a volume parameter in the rate constants

I don't think so. Because if we use concentrations, and X is twice as large as Y, then A would appear two times faster in Y than it disappears from X. So assigning a single parameter to the reaction to be used in the ODE for A_x and A_y would not work. What we could do, is to not only specify substoich and prodstoich in a MTK.Reaction but also add volumes (which in this case would for instance be [2,1]. But that would have to be added to every MTK.Reaction. This duplicate volume specification has huge potential for users to introduce errors/contradictions. And furthermore, this would still require adding a new field to one of the MTK types (in this case Reaction instead of ReactionSystem).

using the newer compositional modeling functionality of ModelingToolkit

I have not looked into this functionality yet, but I also see a problem here. For instance, one may want to combine a metabolic submodel with a signalling network submodel to a larger integrative model. If there is no way to specify compartments other than creating submodels, we would have, say a nuclear and a cytoplasmic model nested under both, the metabolic submodel and the network submodel. Again, this gives potential for the user to introduce contradictions wrt compartmental volumes.

On the other hand, if a MTK.ReactionSystem would optionally contain a compartment volume specification (only needed if there is more than one compartment in the model anyway) the compartment volume would be specified only once, avoiding contradictions. The notion of compartment volume would allow to convert the output solution of an ODEProblem to concentrations, without having to assume unit volume.

paulflang avatar Mar 09 '21 19:03 paulflang

The units are whatever the user's parameters, initial condition, and time variable implicitly define when they construct the concrete problem type. I just tried a couple examples with homodimerization reactions with Unitful using micro-Molar concentrations and seconds, and everything seemed to work fine for ODEs (i.e. the solution that comes out has the right units). Jump models wouldn't be right since they implicitly assume units of "number of" in the rate law, but ODEs (and I think SDEs) should not have a unit restriction as a far as I am aware. I don't think we should require users who are only using the ODE functionality to work in units of number of.

I don't think so. Because if we use concentrations, and X is twice as large as Y,

Sorry if I wasn't clear. I meant that the rate constant for a given reaction should be scaled by the volume of the compartment it is defined in (I'm imagining separate Reaction instances for separate compartments). If you are encoding species/reactions in two compartments then you should make two volume parameters, one for each compartment (i.e. V_N and V_C for nucleus and cytosol), and use them appropriately when defining rate constant expressions within Reactions.

If we become responsible for handling volumes at the ReactionSystem level, don't we then need to know the user's units (for both the species and the rate constants). For example, if the user wants to use units of concentration but has rates in units of time we need to setup one set of volume scalings, while another user who wants to use units of number of but has rates in macroscopic units (i.e. inverse concentration inverse time for bimolecular reactions) would require us to introduce a different scaling.

This feels to me like something best handled in higher level tooling (for example in Catalyst), or directly by users (but I'm happy to be convinced otherwise!). i.e. whatever sets up the ReactionSystem(s) ensures there are shared volume parameters for each compartment, with rate constants scaled appropriately for the units that will be used. (~~And maybe~~ we should add a nice way to set up compartments and/or specify a set of units for which we handle the volume scaling.)

@ChrisRackauckas @TorkelE any thoughts?

isaacsas avatar Mar 10 '21 01:03 isaacsas

It does seem like it's starting to touch on something higher level about shared structures and spatial systems. Though it's similar in nature to things like noise scaling and diagonal noise SDE approximations. I think a scaled_ODESystem(reactionsys,volumnes=reactionsys.volumes) could make sense where it goes in and scales everything, and then volumes could be a field in the reactionsys to easily parse from SBML. IIUC that would be required as the rate itself doesn't determine the rate constants of the ODE without this (in the SBML form).

ChrisRackauckas avatar Mar 10 '21 04:03 ChrisRackauckas

I guess at that point we could just have the volumes in the ReactionSystem (default to 1) and divide the out by default in convert(ODESystem and convert(SDESystem

ChrisRackauckas avatar Mar 10 '21 04:03 ChrisRackauckas

On the topic of spatial things, I've always imagined that the natural way to go is to have one ReactionSystem in each compartment (In most cases probably the same ReactionSystem).

I imagine most people using ReactionSystem would not consider volumes, but if one needs to consider it I guess the ReactionSystem would be the natural place (or alternatively, if we start increasingly need to add stuff, some SBML Reaction System structure, containing a much wider definition of a model). If so, some trivial value like 1 (or nothing, and a check ensuring it is ignored if not set to something else) seems the best way to go?

(caveat: I'm not very familiar with SBML)

TorkelE avatar Mar 10 '21 11:03 TorkelE

I like the idea of keeping ReactionSystems as minimial and barebones as needed. This will make it easier in defining future transforms since there are less things to worry about having to handle. This is to some degree the difference between things like BNGL (which is very expressive), and the simpler .net representation used by Bionetgen (which is really just a list of reactions with some parameters formulas).

@paulflang what volume scaling do you want / need for SBML in ReactionSystems? i.e. is the assumption that rates are always in concentration units, so we need to sometimes convert to per time? Or vice-versa? (I understand about the compartment issue, but don't quite see if there is a "unique" volume scaling that is always needed.)

isaacsas avatar Mar 10 '21 13:03 isaacsas

I also like the idea of keeping ReactionSystems as minimal and barebones as needed. In particular, like SBML and also BNGL, a ReactionSystem description should stay agnostic about the algorithm (SSA or ODE) it is simulated with, even if the described network contains transport of particles between compartments of different size. To achieve this, SBML and BNGL internally represent rate laws as extensive properties, rather than intensive properties (quantity/time rather than quantity/size/time). The reasoning can be found on p154 here and on p76-78 here). Therefore, I totally agree with Chris when he says:

I guess at that point we could just have the volumes in the ReactionSystem (default to 1) and divide the out by default in convert(ODESystem and convert(SDESystem

I see no need to bother with adding units to a ReactionSystem. The only two things I'd like to add are

  • that we need to know which species resides in which compartment to divide by the correct volume. So maybe the states field in MTK.ReactionSystem should accept variable => compartment pairs (like SBML reaction systems and unlike BNGL rules, I suggest a many-to-one mapping from species to compartments).
  • that we need to decide how u0 is interpreted when fed into a DEProblem. Always as extensive property, or problem type dependent sometimes as intensive property (e.g. for and ODEProblem).

paulflang avatar Mar 10 '21 14:03 paulflang

Hi, Just trying to catch up on this. How shall we best proceed?

paulflang avatar Mar 18 '21 01:03 paulflang

If I understand correctly from the SBML document, is what you are looking for the ability to pass a species-dependent scaling factor that simply multiplies the corresponding ODE/SDE rhs (what about jumps?)? i.e. what they call conversionFactors?

For now jumps are definitely extensive, but for ODEs and SDEs we should continue to allow users to use whatever they want. I can imagine in some non-bio fields it would be very unusual to not work in concentration units, where the number of molecules in a reaction may be enormous.

So I think that leaves explicit representations of compartments (instead of just generating all the individual reactions yourself). I’m hoping to have a system composition PR together soon if I can get a little time to work on it, so perhaps then you could take a look and see if that would work with what you need from SBML.

isaacsas avatar Mar 19 '21 02:03 isaacsas

what about jumps?

Mhh, this is actually a problem, as we would get non-integer values, right.

Perhaps the cleanest approach would be that we do not directly translate SBML to ReactionSystem, but add a step before that converts all SBML models that are using intensive properties to extensive properties. Than we convert to ReactionSystem and it just works for ODEs, SDEs and jumps without them being aware of volumes. No conversionFactor needed.

I am just not sure atm how straightforward the intensive to extensive conversion would be. Need to look into that.

paulflang avatar Mar 19 '21 15:03 paulflang

Do SBML files ever encode a model that is both for ODEs and jumps, or are they supposed to use separate files? Looking at the document you linked this wasn't clear to me (but I somehow came away with the impression that the type of mathematical model is supposed to be encoded in the file, so there would be separate files for a jump vs. ODE version of a model). In that case could it be expected that SBML jump model files will have units so that a conversion factor is not needed?

isaacsas avatar Mar 19 '21 16:03 isaacsas

My impression is that SBML encodes models without being clear if they are for ODEs or jumps.

  • SBML allows you to set SBO terms to indicate whether a given kineticLaw is for continuous or discrete interpretation, but they are optional and you can put different SBO term to different kineticLaws in the same model.
  • SBML allows you to set u0 of species as initialAmount or initialConcentration or nothing at all (e.g. if these quantities are obtained via initialAssignments)
  • As neither initialAmount nor initialConcentration are required, they cannot be used to infer whether a species, if it appears in a kineticLaw appears as amount or concentration. Therefore, species have the hasOnlySubstanceUnits flag (if true, this species appears as amount, if false as concentration). Again, species in the same model can differ wrt whether this flag is set or not.

Tbh I find SBML unnecessarily flexible in the sense that it allows you to mix units up that do not really belong together. But I suggest prior to importing into ReactionSystem convert all SBML files to a common format using initialAmount and hasOnlySubstanceUnits=true. That is

  • multiply initialAmounts by the volume in which the species resides, put this value in a newly created initialConcentration attribute and delete the initialAmount attribute.
  • with all species now measured in amounts kineticLaws, wherever a species appears with the hasOnlySubstanceUnits flag set to false will be incorrect. In such kineticLaws divide every occurrence of a species by the size of the volume it resides in.

That way we make sure that everything that ends up expressed as a ReactionSystem is and extensive property. I.e. we have taken care of volumes before creating the ReactionSystem and no longer have to bother with it when creating ODE/SDE/JumpSystems. Sorry I did not see it that way earlier, if you've been trying to convince me all the time. Wrt scaling/Avogadro number we would expect that users have chosen the scaling they need when creating the SBML file already. Does all of that seem reasonable?

paulflang avatar Mar 19 '21 20:03 paulflang

Yes, I think the problem you are recognizing is basically what I was trying (perhaps poorly) to explain!

I guess check back here if you do end up needing anything. For example, that conversionFactor constant for ODEs may be difficult to handle through a ReactionSystem, since it doesn't modify reactions but instead modifies an entire ODE. We could add a field to store it and apply it when generating ODE/SDE models if needed. Alternatively, if I can make progress on allowing variables/expressions for stoichiometric coefficients then it could be handled through them.

isaacsas avatar Mar 19 '21 20:03 isaacsas