Model-based assessment of metabolic functionalities using omics data

Open rasools opened this issue 4 years ago • 36 comments

Description of the issue:

In a recent research article, Richelle et al. suggested an approach for interpreting omics data using GEMs. The key idea behind the approach is to connect changes in transcriptomics/proteomics data to changes in cell functions by defining metabolic tasks. To do this, we first need to define a list of metabolic tasks that human cells can accomplish (similar to the tasks used by automatic GEM reconstruction approaches such as tINIT) and then extract the gene sets associated with each metabolic task. Finally, we need to overlay the omics dataset and measure the pathway usage for each metabolic task. The developed functions could potentially be used alongside the functions in GeneSetAnalysisMatlab to provide more biological insight from changes in transcriptomics data.

  • [ ] Defining a collection of metabolic tasks.
  • [ ] Detecting the smallest set of reactions required to pass a task within the model.
  • [ ] Extracting the genes associated with the reaction set of each metabolic task.
  • [ ] Developing a scoring framework for testing whether genes detected in transcriptomic analysis are enriched in the metabolic task gene sets.
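
The third item above can be sketched as follows. This is a minimal illustration in Python, not part of Human-GEM or RAVEN: the reaction IDs, gene IDs, and the helper names `genes_from_rule` and `task_gene_set` are hypothetical, and it assumes that gene-reaction rules are plain strings combining gene IDs with `and`/`or` and parentheses.

```python
import re


def genes_from_rule(rule):
    """Return the set of gene IDs found in a rule such as '(G1 and G2) or G3'."""
    # Split on whitespace and parentheses, then drop the logical operators.
    tokens = re.findall(r"[^\s()]+", rule)
    return {t for t in tokens if t.lower() not in ("and", "or")}


def task_gene_set(task_reactions, gr_rules):
    """Union of the genes over all reactions required by one task."""
    genes = set()
    for rxn in task_reactions:
        genes |= genes_from_rule(gr_rules.get(rxn, ""))
    return genes


# Hypothetical reactions and rules, for illustration only.
gr_rules = {
    "R1": "(G1 and G2) or G3",
    "R2": "G4",
    "R3": "",  # reaction without gene association
}
task_reactions = ["R1", "R3"]  # assumed minimal reaction set for one task

gene_set = task_gene_set(task_reactions, gr_rules)
print(sorted(gene_set))
```

The sketch ignores the and/or logic when collecting genes; a scoring framework might need to treat complexes and isozymes differently.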

Questions to answer:

  1. The minimum reaction set associated with each metabolic task depends on the allocated boundary fluxes. Which fluxes should be open when we want to find the minimum set of reactions associated with a metabolic task?
  2. What approach should we use to score genes associated with each metabolic task based on their expression in transcriptomic data?

Expected feature/value/output:

A package of functions for performing metabolic task-based gene set analysis.

I hereby confirm that I have:

  • [X] Checked that a similar issue does not exist already

rasools avatar Jul 19 '21 11:07 rasools

This is a great idea. Comparing Cellfie's tasks with Human1's, theirs seem more representative of human metabolism. For example, in Human1's essential tasks, de novo synthesis of nucleotides uses ammonia as input, which is more closely linked to bacterial metabolism than to human metabolism. Did you think about integrating Cellfie's tasks into your task collection? I started importing Cellfie tasks so that we could use them with the functions developed for Human1. This is my current attempt: CONSENSUS_TASKS_tINIT_2.xlsx. However, some tasks are currently not working. If it is something you are looking into, we could try to make it work in Human1.

cherkaos avatar Aug 20 '21 15:08 cherkaos

@cherkaos thank you for the nice input.

Did you think about integrating Cellfie's tasks into your task collection? I started importing Cellfie tasks so that we could use them with the functions developed for Human1. This is my current attempt : CONSENSUS_TASKS_tINIT_2.xlsx However, some tasks are currently not working. If it is something you are looking into, we could try to make it work in Human1.

It would be beneficial to integrate the CellFie tasks into Human-GEM for general usage. Please upload the file to the folder data/metabolicTasks/ with a PR that cites the source and explains the adjustments involved.

haowang-bioinfo avatar Aug 21 '21 11:08 haowang-bioinfo

@cherkaos and @Hao-Chalmers, thanks for your inputs and suggestions. The first step in addressing the current issue is defining the desired set of metabolic tasks. Sarah, as you have suggested, a good start would be to develop a version of Human1 that satisfies the maximum number of both the CellFie tasks and the tasks in data/metabolicTasks/. I will investigate why some CellFie tasks are not satisfied by Human1 and try to solve the problem. This way, we can probably also generate a list of suggestions for improving Human1.

rasools avatar Aug 21 '21 12:08 rasools

Great, I'm glad to hear that it is in your interest. @Rasools - So far, I've only converted Cellfie's metabolites (Recon IDs) into Human1 IDs. Only Tyr_ggn could not be converted (Task 37, Glycogen Biosynthesis). I also converted the compartment [x] into [s], as [x] was not accepted as input in Human1. Metabolites that were listed as both IN and OUT also generated an error when using checkTasks.m, so I removed them from OUT. It was mainly H2O. @Hao-Chalmers - Sure, I can push the document I shared above in the data/metabolicTasks/. However, it is not working in its current form. Is that okay?

cherkaos avatar Aug 23 '21 09:08 cherkaos

@Hao-Chalmers - Sure, I can push the document I shared above in the data/metabolicTasks/. However, it is not working in its current form. Is that okay?

@cherkaos it's okay to begin with this, which can be improved in follow-up PRs.

haowang-bioinfo avatar Aug 23 '21 09:08 haowang-bioinfo

@cherkaos another option is to mark the PR as a draft if you prefer, and then mark it as ready for review whenever you feel it has reached that state.

mihai-sysbio avatar Aug 23 '21 11:08 mihai-sysbio

By the way, I noticed some strange results when using the essential tasks (data/metabolicTasks/metabolicTasks_Essential.xlsx) to create the models and the full tasks (data/metabolicTasks/metabolicTasks_Full.xlsx) for functional comparison. For example, heme biosynthesis is in both lists, but the Human1 manuscript reported that it doesn't pass in blood. How could that be if it is essential? Blood_HemeBiosynthesis

@Rasools @JonathanRob

cherkaos avatar Aug 25 '21 13:08 cherkaos

@cherkaos

For example, heme biosynthesis is in both lists but it was reported in the Human1 manuscript that it doesn't pass in blood. How could that be if it is essential ?

This is due to a minor difference in the heme biosynthesis tasks between the two task files. I suspect it is because the task in the "full" task list requires production and excretion (to the extracellular compartment) of heme, whereas the "essential" task only requires that it is produced (in the cytosol).

I'm not saying that this was intentional or how I believe it should be formulated - we simply implemented these task lists as provided. But this highlights some issues with this approach, in particular some inconsistencies/overlaps between the two lists that likely could use some curation.

JonathanRob avatar Aug 25 '21 16:08 JonathanRob

Thanks for the clarification!

cherkaos avatar Aug 26 '21 11:08 cherkaos

This is an issue I am very interested in. I think one question to reconcile is how we define and check for tasks, since the Richelle et al paper does it without relaxing the pseudo steady state assumption and only constrains inputs/outputs at the flux level.

CadavidJoseL avatar Sep 29 '21 18:09 CadavidJoseL

@CadavidJoseL the tasks collected in Human-GEM are normally defined according to textbooks and/or published literature. They are checked by the checkTasks function, which basically simulates the model with the input/output constraints defined in a task file.

haowang-bioinfo avatar Sep 29 '21 20:09 haowang-bioinfo

Thanks for your answers, those issues were clear to me. I should have been clearer: what I mean is that the way tasks are checked in this related paper by Richelle et al. (and in the COBRA toolbox in general) differs slightly from the way they are checked in the RAVEN toolbox in terms of how the LP is set up: "We also propose to define a metabolic task as the capacity of producing a defined list of output products when only a defined list of input substrates is available. However, we modified the way to implement it from the RAVEN toolbox. Instead of relying on the relaxation of the steady-state assumption, we take an approach more similar to that proposed by [14] by imposing constraints only at the flux level. Therefore, a model successfully passes a task if the associated LP problem is still solvable when the sole exchange reactions allowed carrying flux in the model are temporary sink reactions associated with each of the inputs and outputs listed in the task". I will check whether the LPs are equivalent, but maybe this can be a source of discrepancy?

CadavidJoseL avatar Sep 30 '21 05:09 CadavidJoseL

I will check whether the LPs are equivalent

Look forward to the comparison

haowang-bioinfo avatar Sep 30 '21 05:09 haowang-bioinfo

Hi @CadavidJoseL, I think I partially answered this question in the Gitter chat, but let me know if not. It's true that the checkTasks and tINIT algorithms modify the pseudo steady state assumption (i.e., the b value) for metabolites rather than adjusting reaction flux bounds. The optimization problem works out to be effectively the same for this approach as for the flux-based method, since they're both testing feasibility of the LP for each task. It should be relatively straightforward to implement the tasks from Richelle et al. using the same formulation as expected by checkTasks, just may require some additional testing.

JonathanRob avatar Sep 30 '21 05:09 JonathanRob

Hi @CadavidJoseL I started doing the import from Cellfie Consensus Tasks to Human-GEM but a non-negligible number of them are failing (not just the ones that are supposed to fail) and I was wondering why. I originally thought it was due to compartments but maybe the point you raised about the differences in checking tasks also matters. Hi @JonathanRob - Thanks for the explanation. Where is the Gitter chat? Would like to see what you wrote.

cherkaos avatar Sep 30 '21 13:09 cherkaos

@cherkaos thank you, Sara, for translating the tasks from the Cellfie paper so that they can be tested with Human1. I checked these tasks on Human1 to see how many of them reproduce the expected results (pass/fail). Out of the 195 tasks included in the Cellfie Consensus Tasks, 168 passed (86%), 17 failed, and 10 generated errors.

Error-making tasks are mainly because of undetected metabolites either among substrates or products of the task.

However, the failed tasks probably need more investigation to find the reason, and each could be treated as a separate issue. Here is the list of failed tasks based on my analysis. Could you please share your results so we can see whether you get a similar list of failed tasks?

Task status Task name
FAIL Deoxyguanosine triphosphate synthesis (dGTP)
FAIL Deoxyuridine triphosphate synthesis (dUTP)
FAIL Deoxythymidine triphosphate synthesis (dTTP)
FAIL 3'-Phospho-5'-adenylyl sulfate synthesis
FAIL Degradation of guanine to urate
FAIL Conversion of 1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate to 1D-myo-inositol 1,4,5-trisphosphate
FAIL Arginine synthesis
FAIL Aspartate synthesis
FAIL Synthesis of taurine from cysteine
FAIL Glutamate synthesis
FAIL Glutamine synthesis
FAIL Glycine synthesis
FAIL Conversion of lysine to L-2-Aminoadipate
FAIL Methionine degradation
FAIL Tyrosine synthesis (need phenylalanine)
FAIL Triacylglycerol synthesis
FAIL Synthesis of palmitoyl-CoA

The other question is that all the tasks defined in the list are set to pass. Can you confirm that this matches the tasks defined in the Cellfie paper? I didn't check the tasks there and just used the tasks listed in Cellfie Consensus Tasks.

rasools avatar Oct 04 '21 15:10 rasools

@Rasools Great. Thank you for testing. I tried using the latest Human-GEM. I had 184 tasks passed (94%), 10 errors and 1 failed. Maybe our differences in failed tasks come from the different GEM versions. Which one did you use?

'Error-making tasks are mainly because of undetected metabolites either among substrates or products of the task.' I agree. But I think it has to do with localization. Do you think we should change compartments? What are your ideas on that end?

Task status Task name
Error Glycogen biosynthesis
Error Glycogen degradation
Error Starch degradation
Error Taurochenodeoxycholate synthesis
Error Glycochenodeoxycholate synthesis
Error tauro-cholate synthesis
Error glyco-cholate synthesis
Error Synthesis of bilirubin
Error Glucosaminyl-acylphosphatidylinositoll to deacylated-glycophosphatidylinositol (GPI)-anchored protein
Error Biosynthesis of g3m8masn
Fail Tyrosine synthesis (need phenylalanine)

Yes, I confirm. All these 195 tasks should pass. I wanted to clarify that I was using Cellfie's list which contains tasks that should pass, in comparison to Human-GEM's full tasks where some should fail.

cherkaos avatar Oct 05 '21 22:10 cherkaos

@cherkaos, thanks for sharing the results. Yes, I am using the same model for checking tasks. Because the error-making tasks are the same in our results, let's first focus on the failed/passed tasks to find the origin of the difference in our results. Have you added boundary metabolites to the model prior to checking tasks? If you have boundary metabolites in the model, the total number of metabolites should be 10035. But if boundary metabolites are not included in the model, the number of metabolites is 8370.

rasools avatar Oct 06 '21 08:10 rasools

No, I haven't added boundary metabolites to the model (I have 8371 metabolites).

cherkaos avatar Oct 06 '21 09:10 cherkaos

@cherkaos, so that's probably why almost all of your tasks that are not error-making pass. For checking tasks, we need to have the model in its closed form (containing boundary metabolites). You can add boundary metabolites to the model using the addBoundaryMets function.

rasools avatar Oct 06 '21 10:10 rasools

Okay I get the same results as you.

Task status Task name
Error Glycogen biosynthesis
Error Glycogen degradation
Error Starch degradation
Error Taurochenodeoxycholate synthesis
Error Glycochenodeoxycholate synthesis
Error tauro-cholate synthesis
Error glyco-cholate synthesis
Error Synthesis of bilirubin
Error Glucosaminyl-acylphosphatidylinositoll to deacylated-glycophosphatidylinositol (GPI)-anchored protein
Error Biosynthesis of g3m8masn
Fail Deoxyguanosine triphosphate synthesis (dGTP)
Fail Deoxyuridine triphosphate synthesis (dUTP)
Fail Deoxythymidine triphosphate synthesis (dTTP)
Fail 3'-Phospho-5'-adenylyl sulfate synthesis
Fail Degradation of guanine to urate
Fail Conversion of 1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate to 1D-myo-inositol 1,4,5-trisphosphate
Fail Arginine synthesis
Fail Aspartate synthesis
Fail Synthesis of taurine from cysteine
Fail Glutamate synthesis
Fail Glutamine synthesis
Fail Glycine synthesis
Fail Conversion of lysine to L-2-Aminoadipate
Fail Methionine degradation
Fail Tyrosine synthesis (need phenylalanine)
Fail Triacylglycerol synthesis
Fail Synthesis of palmitoyl-CoA

cherkaos avatar Oct 06 '21 12:10 cherkaos

Friends, I made a quick function that checks the tasks by constraining only the input and output fluxes: it creates temporary exchange reactions involving only the metabolites listed as inputs or outputs of each task, rather than relaxing the pseudo-steady state. I get the same results (errors and fails). As @JonathanRob explained, both ways of checking the tasks seem to be equivalent, since relaxing the pseudo-steady-state constraint for a metabolite (when the system is in closed form) is equivalent to keeping the constraint but having an unbalanced reaction. That being said, I took a closer look at task 15 as a starting point (Deoxyguanosine triphosphate synthesis (dGTP)) and relaxed the bounds of the outputs (lower 0, upper 100). What I found is that the problem is then feasible, albeit with some different fluxes! The production of H+[c] should be constrained to 15, not 14; Pi[c] should be 6, not 4; and PPi[c] should be 1, not 2. With those changes the task passes! I haven't checked other tasks, but it might just be that the stoichiometry they are using is not exactly satisfied by Human-GEM.
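
The equivalence mentioned here can be illustrated on a toy network. The sketch below is plain Python with a hypothetical three-metabolite model (not Human-GEM, and not the function described in this comment): it takes a flux vector whose mass balance is nonzero only for the task metabolites, adds one temporary exchange column per task metabolite, and confirms that the augmented system is back at steady state.

```python
def matvec(S, v):
    """Multiply a stoichiometric matrix (list of rows) by a flux vector."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


# Hypothetical closed toy network: r1: A -> B, r2: B -> C
mets = ["A", "B", "C"]
S = [
    [-1, 0],   # A
    [1, -1],   # B
    [0, 1],    # C
]
v = [2.0, 2.0]  # candidate internal fluxes

# Formulation 1: relaxed steady state, S v = b, with b nonzero only for
# the task input (A, net consumption) and the task output (C, net production).
b = matvec(S, v)

# Formulation 2: keep S v = 0, but add a temporary exchange column for each
# task metabolite (+1: uptake supplies the metabolite, -1: sink removes it).
task_mets = {"A": +1, "C": -1}
S_aug = [row[:] for row in S]
v_aug = v[:]
for met, coef in task_mets.items():
    i = mets.index(met)
    for k, row in enumerate(S_aug):
        row.append(coef if k == i else 0)
    v_aug.append(-b[i] / coef)  # exchange flux that absorbs the imbalance

residual = matvec(S_aug, v_aug)
print("b =", b)
print("exchange fluxes =", v_aug[len(v):])
print("augmented residual =", residual)
```

In this toy case the bounds that formulation 1 places on `b` correspond one-to-one to bounds on the temporary exchange fluxes in formulation 2, which is why both feasibility checks should agree.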

CadavidJoseL avatar Oct 09 '21 04:10 CadavidJoseL

@CadavidJoseL excellent!

Please keep up the good work; I look forward to a thorough investigation of the failed Cellfie Consensus Tasks using your function.

haowang-bioinfo avatar Oct 09 '21 05:10 haowang-bioinfo

Using this approach of relaxing the boundaries on the outputs, the following tasks now pass:

'Deoxyguanosine triphosphate synthesis (dGTP)', 'Deoxyuridine triphosphate synthesis (dUTP)', 'Deoxythymidine triphosphate synthesis (dTTP)', 'Arginine synthesis', 'Aspartate synthesis', 'Glutamate synthesis', 'Glutamine synthesis', 'Synthesis of palmitoyl-CoA'

I am attaching an Excel file with the fluxes returned by the LP in each task (highlighted in red). Bear in mind that these are likely not the only boundary values that work. A proper range of boundaries for the outputs can be defined by FVA, but I haven't done so yet. I just wanted to highlight that the model is indeed capable of passing these tasks. I will investigate the other failed tasks a bit further.

metabolicTasks_Cellfie_corrected.xlsx

CadavidJoseL avatar Oct 09 '21 15:10 CadavidJoseL

@CadavidJoseL it is a very interesting finding that Human1 passes these tasks once the output bounds of some of the metabolites are changed. Do you have any idea what could be the origin of this difference between Human1 and the model that was used in the Cellfie paper? Although the tasks can be passed with this approach, I wonder which of the different output bound values that allow a task to pass should be considered the correct one. @JonathanRob do you have any thoughts on this observation?

rasools avatar Oct 10 '21 22:10 rasools

Thank you for all of your help on this, @cherkaos, @Rasools, and @CadavidJoseL.

The tightly defined flux bounds on these tasks are a bit more constrained than in the typical tasks that I have used, since they enforce very specific flux ratios on the model. As long as the bounds are justified, this is OK, but be aware that setting such rigid bounds can sometimes result in numerical errors (i.e., a problem should be solvable, but rounding errors result in an infeasibility). Generally, I would recommend relaxing the input/output bounds for tasks where the flux ratios are not critical to the definition of the task (e.g., allow a range of fluxes, or even just set the minimum flux values).
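
As a rough illustration of this recommendation, the sketch below uses the dGTP output values reported earlier in the thread (14/4/2 in the task versus 15/6/1 returned by the LP). It is plain Python with hypothetical helper names; it assumes the original task fixed each output with equal lower and upper bounds, and it only checks whether the reported fluxes fall inside a set of bounds, without solving any LP.

```python
def violations(bounds, fluxes):
    """Metabolites whose flux lies outside its (lower, upper) bounds."""
    return sorted(m for m, (lb, ub) in bounds.items()
                  if not lb <= fluxes[m] <= ub)


def relax_to_range(bounds, lb=0.0, ub=100.0):
    """Replace every bound pair by one common range (0 to 100, as in the thread)."""
    return {m: (lb, ub) for m in bounds}


def relax_min_only(bounds, ub=1000.0):
    """Keep each lower bound as a minimum and open the upper bound (ub is assumed)."""
    return {m: (lo, ub) for m, (lo, _) in bounds.items()}


# Assumption: the original task fixes each output flux (LB == UB).
fixed = {"H+[c]": (14, 14), "Pi[c]": (4, 4), "PPi[c]": (2, 2)}
# Output fluxes reported in the thread for Human-GEM.
observed = {"H+[c]": 15, "Pi[c]": 6, "PPi[c]": 1}

print("fixed bounds:   ", violations(fixed, observed))
print("range 0 to 100: ", violations(relax_to_range(fixed), observed))
print("minimum only:   ", violations(relax_min_only(fixed), observed))
```

Under these assumptions, keeping only the original values as minimum fluxes would still exclude the reported PPi[c] flux, so the choice between a range and a minimum depends on the task.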

Regarding some of the differences highlighted by @CadavidJoseL in terms of flux bounds that had to be changed for Human-GEM, it seems that many of these involve reactants/products related to ATP phosphorylation and hydrolysis. Since Human-GEM uses some different coefficients in ATP synthesis and proton pumping reactions compared to other models in order to satisfy the overall energy balance of this process, it may result in slightly different production/consumption of some related metabolites (e.g., protons, phosphate, etc.). Maybe the source of the discrepancy is elsewhere in the model, but this is my first guess.

JonathanRob avatar Oct 14 '21 05:10 JonathanRob

Random thought: it might be good to convert the metabolic task files from Excel to a plain-text format (e.g. tsv).
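
A minimal sketch of what a tab-separated task file could look like, written with Python's standard `csv` module. The column names and the example row are assumptions for illustration only, not the actual task file layout, and reading the existing Excel files is left out.

```python
import csv
import io

# Assumed column subset; the real task files may use other or more columns.
columns = ["ID", "DESCRIPTION", "IN", "IN LB", "IN UB", "OUT", "OUT LB", "OUT UB"]

# Hypothetical task row; multiple metabolites share one cell, separated by ';'.
tasks = [
    {"ID": "T1", "DESCRIPTION": "Example synthesis task",
     "IN": "glucose[s];O2[s]", "IN LB": "0", "IN UB": "1000",
     "OUT": "CO2[s];H2O[s]", "OUT LB": "0", "OUT UB": "1000"},
]


def tasks_to_tsv(rows, fieldnames):
    """Serialise task rows to tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


tsv_text = tasks_to_tsv(tasks, columns)
print(tsv_text)

# Reading the text back yields the same rows.
round_trip = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
```

A plain-text file like this can be diffed line by line in pull requests, which is harder with Excel files.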

haowang-bioinfo avatar Oct 14 '21 11:10 haowang-bioinfo

thanks @JonathanRob for clarifications on this.

rasools avatar Oct 14 '21 12:10 rasools