cobrapy icon indicating copy to clipboard operation
cobrapy copied to clipboard

refactor: turn compartments into objects

Open Midnighter opened this issue 7 years ago • 33 comments

This is based on @MaxGreil's work. I simply rebased his branch onto the latest devel and added two methods. I did so because we need compartment objects soon in order to complete the work on the SBML I/O.

I'm not completely done with this but please add your suggestions here @cdiener and @matthiaskoenig.

Midnighter avatar Jun 13 '18 18:06 Midnighter

Codecov Report

Merging #725 into devel will increase coverage by 0.5%. The diff coverage is 89.17%.

Impacted file tree graph

@@            Coverage Diff            @@
##            devel     #725     +/-   ##
=========================================
+ Coverage   82.98%   83.48%   +0.5%     
=========================================
  Files          40       42      +2     
  Lines        3890     4008    +118     
  Branches      891      920     +29     
=========================================
+ Hits         3228     3346    +118     
+ Misses        463      462      -1     
- Partials      199      200      +1
Impacted Files Coverage Δ
cobra/manipulation/modify.py 91.39% <ø> (ø) :arrow_up:
cobra/io/yaml.py 86.66% <ø> (+33.33%) :arrow_up:
cobra/core/metabolite.py 76.19% <100%> (+5.17%) :arrow_up:
cobra/util/util.py 100% <100%> (ø) :arrow_up:
cobra/io/schemata/__init__.py 100% <100%> (ø)
cobra/io/sbml3.py 81.4% <100%> (ø) :arrow_up:
cobra/io/mat.py 68.2% <100%> (ø) :arrow_up:
cobra/io/sbml.py 65.77% <71.42%> (ø) :arrow_up:
cobra/core/compartment.py 83.33% <83.33%> (ø)
cobra/io/dict.py 88.09% <87.5%> (-0.37%) :arrow_down:
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 0fc8ea2...37d33c8. Read the comment docs.

codecov-io avatar Jun 13 '18 18:06 codecov-io

Sorry, didn't have time to test the branch but for me the most important parts would be the following two:

  1. Backwards compatibility. SBML, JSON and YAML models without compartments should still work.
  2. model.compartments should be a dynamic attribute that is generated on request from the compartments of the reactions. Otherwise we it will be very hard to synchronize the addition and deletion of new reactions with that attribute. So basically Model has a private list of annotated compartments and but the attribute shows only those that are found in reactions and adds minimally annotated ones for compartments that appear in reactions but not in the private annotated list.

If those two are met I think the PR is ready to go :smile:

cdiener avatar Jun 14 '18 17:06 cdiener

I'll try to summarize what is still required for this to be 'done':

  • [x] update all example notebooks
  • [x] revert GENE ASSOCIATION back to GENE_ASSOCIATION?
  • [ ] ~check the import and export of models to SBML (with and without compartments)~
  • [x] check the import and export of models to JSON (with and without compartments)
  • [x] move existing JSON schema from python module to separate JSON file, version in the schema should be a string not an integer, make new schema that contains new requirements on compartments
  • [x] check the import and export of models to YAML (with and without compartments), whatever is generated here should be identical to JSON and also be backwards-compatible
  • [x] model.compartments should be a dynamic attribute that is generated on request from the compartments of the metabolites VS. model.compartments is a static list that is updated whenever a) metabolites are added or removed, b) a compartment is added or removed, c) a reaction is added or removed, d) the compartment on a metabolite is changed, e) user feedback ought to be given when non-empty compartment is removed.

ChristianLieven avatar Aug 15 '18 12:08 ChristianLieven

This should be good to go now. Please let me know if anything is unclear.

I'm looking forward to comments and feedback.

ChristianLieven avatar Aug 22 '18 14:08 ChristianLieven

Wondering if we could address #747 with that at once (see last comments there). If we could add a way to mark a compartment as external I could integrate that into the media module.

cdiener avatar Aug 23 '18 15:08 cdiener

Fix #652 Fix #696

Midnighter avatar Sep 06 '18 09:09 Midnighter

What's a use case for the model.add_compartments and model.remove_compartments methods? Those seem like complex pieces of code (especially if we wanted to support contexts). Is it mainly to edit overall model structures, or are there interactive uses where you'd want to remove a compartment, run a simulation, and add it back in? If its the former, how would you feel about those living in cobra.manipulation?

pstjohn avatar Sep 10 '18 21:09 pstjohn

Hi all, what is going on with this pull request? Is this up to date? M

matthiaskoenig avatar Mar 18 '19 09:03 matthiaskoenig

@pstjohn

What's a use case for the model.add_compartments and model.remove_compartments methods? Those seem like complex pieces of code (especially if we wanted to support contexts). Is it mainly to edit overall model structures, or are there interactive uses where you'd want to remove a compartment, run a simulation, and add it back in? If its the former, how would you feel about those living in cobra.manipulation?

We have a use case for it; just posted here: https://github.com/opencobra/cobrapy/issues/760#issuecomment-491934045

I'm happy to give more details!

zakandrewking avatar May 13 '19 18:05 zakandrewking

Hi all, I have a need for the compartment objects and was wondering when they would be integrated into the stable/master branch of COBRApy.

Additionally, I noticed that current compartment class does not have a size or volume attribute. Just fyi, for my use case (and general SBML compatibility) it would be helpful if I could set the size of the compartment.

z-haiman avatar Jun 20 '19 16:06 z-haiman

Same here. Also have a need for compartment objects. Would be great to have these in cobrapy

  • I want to be able to read the annotations and work with them
  • There should be memote tests on the compartments which are only possible if these are first class objects (with annotations). See https://github.com/opencobra/memote/issues/647

matthiaskoenig avatar Jun 23 '19 20:06 matthiaskoenig

Hi all, can we merge this? I urgently need first class compartment objects for the last 2 years :) No seriously, can we do this. This is especially important for annotations and other information on compartments.

Best Matthias

matthiaskoenig avatar Jul 17 '20 10:07 matthiaskoenig

I also want this feature a lot. The complications that have been blocking this are:

  • Handling compartment addition to/removal from a model adequately.
  • (Allow copying of compartments from one model to another. This could be a later feature as we'd have to figure out what this means.)
  • Correctly handle addition and removal when working in a model context.

Midnighter avatar Jul 17 '20 11:07 Midnighter

Hi all,

Is there a timeline associated with the completion and integration of the compartment class that can be shared?

I could use really use compartment objects for constraint-based/dynamic modeling workflows, and would be happy to help out in any way I can.

Best,

Zack

z-haiman avatar Aug 12 '20 20:08 z-haiman

Hey @z-haiman, I can't give you an exact timeline since we don't have any full time developers. We're getting some important things done recently, though, so this is becoming feasible to do and I've put this on our roadmap.

Midnighter avatar Aug 13 '20 09:08 Midnighter

Would you like me to try and tackle updating this? If so, what needs to be done besides getting it to the most recent devel and making it Python 3.6+ compliant?

akaviaLab avatar Mar 10 '22 22:03 akaviaLab

The main thing I got stuck on was to come up with a good design for our "undo" functionality, i.e., when you modify a model in a context, those actions are undone again upon leaving the context. This is more complicated with compartments when you add a compartment from one model to another one. Does that mean you also add all metabolites & reactions in that compartment from the other model?

I'm probably overthinking this a bit but didn't have the chance to discuss different design choices so far. If you want to take this on, I suggest we have an online meeting and come up with a good design.

Midnighter avatar Mar 11 '22 09:03 Midnighter

Hi @Midnighter We can do zoom, Skype, or I can email the list and we can have a conversation about the architecture and the features needed. Let me know what you'd like please.

akaviaLab avatar Mar 23 '22 14:03 akaviaLab

  • Adding a compartment from another model should just add the compartment.
  • Do we handle compartments on reactions in cobrapy? I.e. not only species can have a compartment, but also reactions. Not sure this is currently supported in cobrapy, but it is part of SBML and important information, e.g. for layout algorithms.

matthiaskoenig avatar Mar 24 '22 09:03 matthiaskoenig

We can do zoom, Skype, or I can email the list and we can have a conversation about the architecture and the features needed. Let me know what you'd like please.

Maybe this should be discussed at a dev meeting? @cdiener wanted to schedule one soon.

Adding a compartment from another model should just add the compartment.

Should it? I mean, it could be super convenient to say, copy mitochondrion from one model to another. Of course, it's not hard to loop over compartment contents and add those. Maybe Model.add_compartment should have a recursive flag or so to do that?

Do we handle compartments on reactions in cobrapy? I.e. not only species can have a compartment, but also reactions. Not sure this is currently supported in cobrapy, but it is part of SBML and important information, e.g. for layout algorithms.

Not directly, for MEMOTE we looked at the metabolites of a reaction to figure it out. How are transport reactions handled in SBML using this?

Midnighter avatar Mar 24 '22 11:03 Midnighter

For the transport reactions you just add a compartment such as id="pm" name="plasma membrane" spatialDimensions="2" and then set transport_reaction.compartment=pm. This also allows nicely to define at which interface compartments transporters are located. The spatialDimensions information is especially important in scaling of systems, because volume and area scale differently, e.g. if a cell growths. So you would scale proteins differently if a cell growths.

I agree, copying of the complete compartment is nice to have, but should not be the default behavior.

Dev meeting would be great. Would be happy to participate.

matthiaskoenig avatar Mar 24 '22 11:03 matthiaskoenig

Sorry forgot. There are also many membrane bound reactions which catalyze on one side of the compartment. I.e. the enymes are associated with the membrane, but all the metabolites involved are on one side of the membrane. Inferring compartments from metabolites gives incorrect compartments here.

matthiaskoenig avatar Mar 24 '22 11:03 matthiaskoenig

Sorry forgot. There are also many membrane bound reactions which catalyze on one side of the compartment. I.e. the enymes are associated with the membrane, but all the metabolites involved are on one side of the membrane. Inferring compartments from metabolites gives incorrect compartments here.

I don't see yet how this gives the wrong information. If an enzyme is membrane bound but carries out the reaction only in one compartment, that is where the reaction happens, right?

Midnighter avatar Mar 24 '22 13:03 Midnighter

Agree with @Midnighter here. Reaction is a bit of an abstract concept but if you go by physical chemistry it is a collision that passes an energy threshold so it requires physical contact. So the actual reaction would still take place where the substrate is located. For channels the initial adsorption/entry takes place on one side of the membrane and secretion/export on the other side. The substrate and product would have different compartments and the reaction would have two compartments due to the multi-step nature. Which is how it is handled in cobrapy right now. If you want to specifically model what happens with the molecule inside the channel you have to add a species in the membrane compartment as well. I totally get that enzymes can have a single location in that case but reactions are not exactly the same thing.

cdiener avatar Mar 24 '22 14:03 cdiener

Okay. I'd be happy to participate in some kind of debate in a developer meeting if you'll let me know when it is. I've tried updating this PR request, and it is a bit of a mess - too many changes for it to be easily mergable. My planned approach is to use this as a framework and starting point, and just reapply the changes on the new dev.

I've also been thinking a bit about the computational format of compartments, and my approach, based on my biological knowledge and intuition is

  1. Something similar in code and functionality to groups, which includes metabolites that are in the compartment.
  2. We can set the framework to include the possibility of genes/proteins to be included, but not set it up as default. That should give the possibility of having proteins in certain places, like what @matthiaskoenig mentioned about proteins, but not set it up as default, since it is somewhat complicated for models that don't include enzyme location.
  3. Reactions will be derived from the metabolites. Something like
def @reactions
   rxnSet = set()
   for met in self._metabolites:
      rxnSet.update(met._reactions)
   return frozenset(rxnSet)

That should comply with what @cdiener and @Midnighter pointed out. 4) I like having compartments be dictlists, since then you can do various queries on the metabolites. That will be useful if for example, I want to split a compartment into two - it will let me find all metabolites of a certain type ,and move them away. 5) Compartments should be able to be copied from models, but won't have any members. Copying from other models with metabolites will lead to problems if these metabolites don't already exist in the model. If you want to copy the members, use the query/get_by on the original compartment, and then the result will be added to the new compartment. Making it explicit will make people have to be aware of the metabolites. 6) Compartments can be merged (so the new one has all metabolites of old ones), added, deleted from model. You can also add/remove entities from a compartment. I think groups already has context set up for most of these, so the code can be adapted for whatever compartments end up being.

akaviaLab avatar Mar 24 '22 18:03 akaviaLab

@cdiener and @Midnighter: For scaling of reaction bounds based on RNA or protein data the incorrect compartment will give the incorrect scaling. I.e. proteins (reactions) in the membrane should be scaled based on Area, not volume. For this to work the reactions should be assigned the correct location, i.e. it happens at the membrane. I.e. See the following paper for discussion in the kinetic context: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03913-8

However, many processes happen at membranes and thus not in 3D, but in 2D. The scaling of the rates of these processes poses a special problem, if volumes of compart- ments are changed. They will typically scale with an area, but not with the volume of the involved compartment. However, commonly, this is neglected when setting up models and/or volume scaling also sometimes automatically happens when using modelling software in the field.

This is a big issue when using Data to set bounds for reactions, or also to just account for dilution with cell growth (e.g. in approaches such as RBA). Because cobrapy is meant as a core library on which other packages can build more complex analysis the correct handling of this cases would be important. If cobrapy is just a simple tool for FBA, never mind.

Similar issues exist for visualization algorithms using the information in SBML. The important information about the localization of the reaction at the membrane is lost and the network is visualized incorrectly. Yes, it makes sense to infer localization from metabolites if no localization information is available for the reaction. But if the user provides information this should not be discarded. I.e. it must be possible to set the compartment of a reaction and existing information should not be overwritten by heuristics.

matthiaskoenig avatar Mar 25 '22 12:03 matthiaskoenig

@matthiaskoenig if the reaction happens in the membrane I think everybody agrees. But the argument from the paper seems incomplete as only one component is bound to the membrane but the other may still diffuse freely in the cytosol. If you scale the cell volume up this would still enter as a volumetric measure for that second freely diffusible component. And regarding the scope of cobrapy: as the name implies we provide infrastructure for constraint-based modeling which is admittedly not very well defined but I would say that kinetic models and (general) visualization would generally not fall within that scope. However, I do agree that information should not be dropped and we could save that either in the annotations or a separate attribute. Though it would be nice if SBML could support multiple compartments for an enzyme since membrane-bound components or channels do have parts embedded in several compartments. I am also not super eager of allowing that reactions may involve metabolites that have no overlap with the enzyme compartment now. Having a reaction assigned to the nucleus using metabolites from the extracellular space does not look right to me.

cdiener avatar Mar 25 '22 15:03 cdiener

@cdiener @matthiaskoenig I looked at the SBML standard, and it seems that only species have compartments. I think that means that per SBML, only metabolites have compartments, but I could be wrong. Compartments can have any number of dimensions between 0 and 3, so you could set up the membrane as a compartment with two dimensions (and then it would be scaled by area). There also seems to be an ability to set up compartment type and compartment relationships using the multi plugin in SBML. Some questions

  1. Did I understand the SBML standard correctly?
  2. What would you like to have compartments able to be assigned in cobrapy? Metabolites? Genes? Reactions? All of them? Would a compartment be unique to one type of object, or can the same one have genes and metabolites, for example.

akaviaLab avatar Mar 25 '22 15:03 akaviaLab

@akaviaLab you are wrong. Please just look at the SBML specification. You definitely did not look in the document. Reactions have compartments.

So the main question is what to do if the compartment attribute is already set on a reaction or directly assigned. This should have precedence over any heuristics. That is all I am advocating here.

matthiaskoenig avatar Mar 25 '22 15:03 matthiaskoenig

@matthiaskoenig we can't remove the reaction compartments logic in cobrapy right now due to backwards compatibility. Also the interpretation is different and more correct in cobrapy IMO as it actually ensures that reaction compartments are compatible with metabolite compartments (but we can ask the community to decide). But we can definitely store the SBML-assigned compartment and have that assigned as a different attribute like rxn.enzyme_compartment (which is what SBML seems to describe) or in the annotations. SBML writing would than use this compartment.

But either way it's unrelated to the PR and can be addressed in a separate discussion.

cdiener avatar Mar 25 '22 16:03 cdiener