Improved incorporation of enzyme complex data into GPRs

Open JonathanRob opened this issue 7 years ago • 11 comments

Description of the issue:

HMR2 GPRs were updated with enzyme complex information (i.e., "AND" expressions) using a process that combined importing rules from other models (iHsa and Recon3D) with incorporation of complex information from the CORUM database. This approach should be refined and extended in future iterations to proceed as follows:

  1. Extract all enzyme complexes from previous models (iHsa/Recon3D).
  2. Merge/combine these complexes with those from CORUM and other such databases.
  3. Remove all "AND" expressions from grRules.
  4. Re-incorporate the combined complex data into grRules.
  5. Curate the new grRules to verify that each enzyme complex is supported by literature and is associated with the proper reaction.
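Steps 3 and 4 could be sketched roughly as follows. This is only a minimal illustration, not the procedure actually used for Human-GEM: the helper names `flatten_rule` and `rebuild_rule` are hypothetical, the example rule is a toy, and the heuristic (apply a complex whenever all of its subunits are already associated with the reaction) is an assumption that would still need the manual curation of step 5.

```python
import re

# Assumption: genes in grRules are written as Ensembl gene IDs (ENSG...).
GENE_PATTERN = re.compile(r"ENSG\d+")


def flatten_rule(gr_rule):
    """Step 3: discard all AND/OR structure, keep the sorted unique gene IDs."""
    return sorted(set(GENE_PATTERN.findall(gr_rule)))


def rebuild_rule(genes, complexes):
    """Step 4: re-group genes into complexes, OR-ing all resulting terms.

    A complex is applied only if every one of its subunits is already
    associated with the reaction (a naive heuristic, see the note above).
    """
    gene_set = set(genes)
    terms, used = [], set()
    for cplx in sorted(complexes, key=sorted):
        if len(cplx) > 1 and cplx <= gene_set:
            terms.append("(" + " and ".join(sorted(cplx)) + ")")
            used |= cplx
    # Genes not covered by any complex remain independent alternatives.
    terms.extend(sorted(gene_set - used))
    if len(terms) == 1:
        return terms[0].strip("()")
    return " or ".join(terms)


# Toy rule for illustration only; the gene IDs appear elsewhere in this thread.
old_rule = "ENSG00000131828 or ENSG00000168291 or ENSG00000163114"
merged_complexes = [frozenset({"ENSG00000131828", "ENSG00000168291"})]

genes = flatten_rule(old_rule)
new_rule = rebuild_rule(genes, merged_complexes)
print(new_rule)
```

Keeping the merged complex list separate from the grRules means the rules can be regenerated whenever the complex data from steps 1 and 2 are updated.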

Expected feature/value/output:

More accurate GPRs that are supported by literature, rather than just imported from previous model versions.

I hereby confirm that I have:

  • [X] Checked that a similar issue does not exist already

JonathanRob, Sep 26 '18 13:09

Some ideas:

  • Steps 1 and 2 appear to be the bottleneck and are challenging; it would be good to have a practical plan for implementing them.
  • Another consideration is how to seamlessly incorporate previous work into this approach.

haowang-bioinfo, Sep 27 '18 10:09

AND rules can be curated manually using the UniProt "Subunit structure" information, where the components of protein complexes are given.

pinarkocabas, Aug 06 '20 11:08

@pinarkocabas sounds good. Do you have some example grRules with refined AND relations after incorporating the UniProt Subunit Structure information?

haowang-bioinfo, Aug 06 '20 11:08

A recent publication that may be useful for this effort: https://www.embopress.org/doi/full/10.15252/msb.202010016

JonathanRob, May 11 '21 15:05

@JonathanRob this is quite relevant indeed.

haowang-bioinfo, May 12 '21 05:05

Among the 6965 protein complexes collected in huMAP 2.0, a total of 443 complexes were found whose subunits are all included in Human-GEM, according to their UniProt IDs. There are even 5 complexes that have already been included in Human-GEM:

| HuMAP2 ID | Confidence | UniProt accessions | Gene names | Ensembl IDs | Reaction IDs |
|---|---|---|---|---|---|
| HuMAP2_00663 | 2 | Q14181;P09884 | POLA2;POLA1 | ENSG00000014138 and ENSG00000101868 | HMR_6410;HMR_8746 |
| HuMAP2_04771 | 1 | P08559;P11177 | PDHA1;PDHB | ENSG00000131828 and ENSG00000168291 | HMR_4000;HMR_4022 |
| HuMAP2_05081 | 4 | Q02153;Q02108 | GUCY1B1;GUCY1A1 | ENSG00000061918 and ENSG00000164116 | HMR_9578 |
| HuMAP2_05962 | 1 | P30153;P67775 | PPP2R1A;PPP2CA | ENSG00000105568 and ENSG00000113575 | HMR_9491 |
| HuMAP2_05199 | 2 | Q9UBE0;Q9UBT2 | SAE1;UBA2 | ENSG00000126261 and ENSG00000142230 | HMR_7160 |

There are also 259 huMAP complexes whose subunits overlap with those of existing Human-GEM complexes. However, the catalytic association is unclear because huMAP does not provide links to any enzymatic reactions (this has been reported as an issue).
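The kind of comparison described above could be sketched as below. This is a hedged illustration rather than the script that produced the numbers: `classify_complexes` is a hypothetical helper, the inputs are assumed to be already parsed into sets of UniProt accessions, the `toy_*` entries and `X0000*` accessions are invented, and "overlap" is taken here to mean sharing at least one subunit without being identical.

```python
def classify_complexes(humap_complexes, model_proteins, model_complexes):
    """Sort huMAP complexes into three (non-exclusive) categories.

    humap_complexes: dict mapping huMAP ID -> iterable of UniProt accessions
    model_proteins:  iterable of UniProt accessions present in the model
    model_complexes: iterable of subunit collections already in the model
    """
    model_proteins = set(model_proteins)
    model_complexes = [frozenset(c) for c in model_complexes]
    fully_covered, already_present, overlapping = [], [], []
    for complex_id, subunits in humap_complexes.items():
        subunits = frozenset(subunits)
        if subunits <= model_proteins:
            # Every subunit maps to a protein in the model.
            fully_covered.append(complex_id)
        if subunits in model_complexes:
            # Identical subunit composition already exists in the model.
            already_present.append(complex_id)
        elif any(subunits & c for c in model_complexes):
            # Shares at least one subunit with an existing model complex.
            overlapping.append(complex_id)
    return fully_covered, already_present, overlapping


# Toy input: only HuMAP2_04771 and its accessions are taken from the table above.
humap = {
    "HuMAP2_04771": {"P08559", "P11177"},
    "toy_partial": {"P08559", "P11177", "X00001"},
    "toy_unrelated": {"X00002", "X00003"},
}
proteins_in_model = {"P08559", "P11177", "Q14181", "P09884"}
complexes_in_model = [{"P08559", "P11177"}]

covered, present, overlap = classify_complexes(
    humap, proteins_in_model, complexes_in_model
)
print(covered, present, overlap)
```

None of these categories says anything about which reaction a complex catalyzes, so the reaction association would still have to come from another source.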

haowang-bioinfo, May 12 '21 19:05

Some additional complexes that are potentially related between huMAP2 and Human-GEM:

| HuMAP2 ID | Human-GEM complex |
|---|---|
| HuMAP2_05290 | ENSG00000121879 and ENSG00000145675 |
| HuMAP2_02434 | ENSG00000088305 and ENSG00000130816 |
| HuMAP2_03050 | ENSG00000163541 and ENSG00000172340 |
| HuMAP2_01188 | ENSG00000091140 and ENSG00000150768 and ENSG00000163114 and ENSG00000168291 |
| HuMAP2_06086 | ENSG00000091140 and ENSG00000150768 and ENSG00000163114 and ENSG00000168291 |
| HuMAP2_02405 | ENSG00000163114 and ENSG00000168291 |
| HuMAP2_03036 | ENSG00000132155 and ENSG00000157764 |
| HuMAP2_01652 | ENSG00000107854 and ENSG00000173273 |
| HuMAP2_03050 | ENSG00000136143 and ENSG00000163541 |
| HuMAP2_06828 | ENSG00000070770 and ENSG00000101266 |
| HuMAP2_06086 | ENSG00000091140 and ENSG00000131828 and ENSG00000150768 and ENSG00000168291 |
| HuMAP2_03004 | ENSG00000126261 and ENSG00000142230 |
| HuMAP2_05459 | ENSG00000126261 and ENSG00000142230 |
| HuMAP2_01225 | ENSG00000014138 and ENSG00000101868 |
| HuMAP2_03036 | ENSG00000126934 and ENSG00000169032 |
| HuMAP2_05209 | ENSG00000062822 and ENSG00000077514 and ENSG00000106628 and ENSG00000175482 |
| HuMAP2_01375 | ENSG00000111716 and ENSG00000134333 |
| HuMAP2_04197 | ENSG00000111716 and ENSG00000134333 |

haowang-bioinfo, May 13 '21 09:05

A summary of the findings so far:

  • 443 complexes were found whose subunits are all included in Human-GEM, according to their UniProt IDs
  • 5 complexes that had already been included in Human-GEM
  • 259 huMAP complexes whose subunits overlap with existing Human-GEM complexes, [without] links to any enzymatic reactions
  • another 18 "additional complexes that are potentially related between huMAP2 and Human-GEM"

@Hao-Chalmers what do you mean by "potentially related", and how do these relate to the previously identified ones?

mihai-sysbio, May 14 '21 08:05

@Hao-Chalmers what do you mean by "potentially related", and how do these relate to the previously identified ones?

For the 18 "potentially related" pairs, the two complexes differ by only one subunit, i.e. huMAP2 subunit number = Human-GEM subunit number + 1, and the remaining subunits are the same within each pair. These 18 complexes are a subset of the 259.
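Expressed as code, this criterion might look like the sketch below. The function name and the `geneA`, `geneB`, `geneC` identifiers are hypothetical placeholders; the only assumption is that both complexes are given as collections of identifiers in the same namespace.

```python
def differs_by_one_extra_subunit(humap_subunits, model_subunits):
    """True if the huMAP2 complex is the Human-GEM complex plus exactly one subunit."""
    humap_subunits = set(humap_subunits)
    model_subunits = set(model_subunits)
    # The model complex must be a proper subset, with a single subunit missing.
    return (
        model_subunits < humap_subunits
        and len(humap_subunits - model_subunits) == 1
    )


# Placeholder identifiers for illustration only.
print(differs_by_one_extra_subunit({"geneA", "geneB", "geneC"}, {"geneA", "geneB"}))
print(differs_by_one_extra_subunit({"geneA", "geneC"}, {"geneA", "geneB"}))
```

The first call matches the criterion (one extra subunit, the rest identical), while the second does not because the model complex contains a subunit that is absent from the huMAP2 complex.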

haowang-bioinfo, May 14 '21 08:05

@feiranl found this https://www.biorxiv.org/content/10.1101/2021.02.28.433152v1, which could also come in handy, but it needs to be run on the cluster. The (hypothetical) advantage is that it would require less manual work.

mihai-sysbio, Jul 09 '21 10:07

Working on this right now.

feiranl, Aug 22 '23 10:08