Human-GEM
Improved incorporation of enzyme complex data into GPRs
Description of the issue:
HMR2 GPRs were updated with enzyme complex information (i.e., "AND" expressions) using a process that combined importing rules from other models (iHsa and Recon3D) with incorporation of complex information from the CORUM database. This approach should be refined and extended in future iterations to proceed as follows:
1. Extract all enzyme complexes from previous models (iHsa/Recon3D).
2. Merge/combine these complexes with those from CORUM and other such databases.
3. Remove all "AND" expressions from grRules.
4. Re-incorporate the combined complex data into grRules.
5. Curate the new grRules to verify that each enzyme complex is supported by literature and is associated with the proper reaction.
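The extraction, "AND" removal, and re-incorporation steps above can be sketched roughly as follows. This is only an illustration under strong assumptions: grRules are taken to be flat "or" lists of "and" groups (no nested parentheses), all function names are hypothetical, and a complex is re-applied to a reaction whenever all of its subunits already appear in that reaction's rule. That last rule is a naive placeholder for the manual curation step, not a replacement for it.

```python
import re

def parse_complexes(gr_rule):
    """Split a flat grRule such as '(A and B) or C' into its AND groups."""
    groups = []
    for term in re.split(r"\s+or\s+", gr_rule.strip()):
        genes = [g for g in re.split(r"\s+and\s+", term.strip("() ")) if g]
        if genes:
            groups.append(frozenset(genes))
    return groups

def format_rule(groups):
    """Write AND groups back as a grRule string in a deterministic order."""
    parts = []
    for group in sorted(groups, key=lambda s: sorted(s)):
        genes = sorted(group)
        parts.append(genes[0] if len(genes) == 1 else "(" + " and ".join(genes) + ")")
    return " or ".join(parts)

def extract_complexes(gr_rules):
    """Collect every multi-gene AND group found in a collection of grRules."""
    return {g for rule in gr_rules for g in parse_complexes(rule) if len(g) > 1}

def strip_and(gr_rule):
    """Flatten a grRule to a plain OR list of its genes."""
    genes = set().union(*parse_complexes(gr_rule))
    return format_rule({frozenset([g]) for g in genes})

def reincorporate(gr_rule, complexes):
    """Re-apply complexes whose subunits are all present in the rule (naive assumption)."""
    genes = set().union(*parse_complexes(gr_rule))
    used = [c for c in complexes if c <= genes]
    covered = set().union(*used)
    return format_rule(set(used) | {frozenset([g]) for g in genes - covered})

# Toy rules with made-up gene names, not taken from any model.
old_rules = {"R1": "(G1 and G2) or G3", "R2": "G4 or G5"}
from_models = extract_complexes(old_rules.values())
from_databases = {frozenset({"G4", "G5"})}   # stands in for CORUM-like data
merged = from_models | from_databases

new_rules = {rxn: reincorporate(strip_and(rule), merged) for rxn, rule in old_rules.items()}
print(new_rules)
```

Keeping complexes as sets of gene IDs makes the merge step a plain set union, so duplicates between sources collapse automatically.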
Expected feature/value/output:
More accurate GPRs that are supported by literature, rather than just imported from previous model versions.
I hereby confirm that I have:
- [X] Checked that a similar issue does not exist already
Some ideas:
- Steps 1 and 2 appear to be challenging bottlenecks, so it would be good to have a practical plan for implementing them.
- Another consideration is how to seamlessly incorporate previous work into this approach.
AND rules can be curated manually using the UniProt Subunit Structure information, where the components of protein complexes are given.
@pinarkocabas sounds good. Do you have some example grRules with refined AND relations after incorporating the UniProt Subunit Structure information?
A recent publication that may be useful for this effort: https://www.embopress.org/doi/full/10.15252/msb.202010016
@JonathanRob this is quite relevant indeed.
Among the 6965 protein complexes collected in huMAP 2.0, a total of 443 complexes were found for which all subunits are included in Human-GEM, according to their UniProt IDs. There are even 5 complexes that had already been included in Human-GEM:
| HuMAP2_ID | Confidence | UniProt_ACCs | Gene names | Ensembl IDs | Reaction IDs |
|---|---|---|---|---|---|
| HuMAP2_00663 | 2 | Q14181;P09884 | POLA2;POLA1 | ENSG00000014138 and ENSG00000101868 | HMR_6410;HMR_8746 |
| HuMAP2_04771 | 1 | P08559;P11177 | PDHA1;PDHB | ENSG00000131828 and ENSG00000168291 | HMR_4000;HMR_4022 |
| HuMAP2_05081 | 4 | Q02153;Q02108 | GUCY1B1;GUCY1A1 | ENSG00000061918 and ENSG00000164116 | HMR_9578 |
| HuMAP2_05962 | 1 | P30153;P67775 | PPP2R1A;PPP2CA | ENSG00000105568 and ENSG00000113575 | HMR_9491 |
| HuMAP2_05199 | 2 | Q9UBE0;Q9UBT2 | SAE1;UBA2 | ENSG00000126261 and ENSG00000142230 | HMR_7160 |
There are also 259 huMAP complexes whose subunits overlap with existing Human-GEM complexes. However, the catalytic association is unclear because huMAP does not provide links to any enzymatic reactions (this has been reported as an issue).
Some additional complexes that are potentially related between huMAP2 and Human-GEM:
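For reference, a comparison of this kind could be sketched as below. This is not the script that produced the numbers above. The function name, the category labels, and all `toy_*` / `X0000*` / `ENSG_TOY_*` identifiers are made up for illustration, and the sketch assumes that "overlap" is only evaluated for complexes whose subunits all map to Human-GEM genes, which may differ from how the 259 were actually counted.

```python
def classify_complexes(humap, uniprot_to_ensembl, model_complexes):
    """Sort huMAP complexes into categories relative to the model's AND groups."""
    result = {"not_in_model": [], "already_included": [], "overlapping": [], "new": []}
    for complex_id, accessions in humap.items():
        # A complex is only considered if every subunit maps to a model gene.
        if not all(acc in uniprot_to_ensembl for acc in accessions):
            result["not_in_model"].append(complex_id)
            continue
        genes = frozenset(uniprot_to_ensembl[acc] for acc in accessions)
        if genes in model_complexes:
            result["already_included"].append(complex_id)
        elif any(genes & existing for existing in model_complexes):
            result["overlapping"].append(complex_id)
        else:
            result["new"].append(complex_id)
    return result

# One real row from the table above plus made-up entries (toy_*, X0000*, ENSG_TOY_*).
humap = {
    "HuMAP2_04771": ["P08559", "P11177"],
    "toy_overlap": ["P08559", "X00001"],
    "toy_new": ["X00001", "X00002"],
    "toy_unmapped": ["X00003"],
}
uniprot_to_ensembl = {
    "P08559": "ENSG00000131828",  # PDHA1
    "P11177": "ENSG00000168291",  # PDHB
    "X00001": "ENSG_TOY_1",
    "X00002": "ENSG_TOY_2",
}
model_complexes = {frozenset({"ENSG00000131828", "ENSG00000168291"})}

summary = classify_complexes(humap, uniprot_to_ensembl, model_complexes)
print(summary)
```

Because huMAP gives no reaction links, the "overlapping" and "new" categories would still need manual assignment to reactions.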
| HuMAP2_ID | Human-GEM complex |
|---|---|
| HuMAP2_05290 | ENSG00000121879 and ENSG00000145675 |
| HuMAP2_02434 | ENSG00000088305 and ENSG00000130816 |
| HuMAP2_03050 | ENSG00000163541 and ENSG00000172340 |
| HuMAP2_01188 | ENSG00000091140 and ENSG00000150768 and ENSG00000163114 and ENSG00000168291 |
| HuMAP2_06086 | ENSG00000091140 and ENSG00000150768 and ENSG00000163114 and ENSG00000168291 |
| HuMAP2_02405 | ENSG00000163114 and ENSG00000168291 |
| HuMAP2_03036 | ENSG00000132155 and ENSG00000157764 |
| HuMAP2_01652 | ENSG00000107854 and ENSG00000173273 |
| HuMAP2_03050 | ENSG00000136143 and ENSG00000163541 |
| HuMAP2_06828 | ENSG00000070770 and ENSG00000101266 |
| HuMAP2_06086 | ENSG00000091140 and ENSG00000131828 and ENSG00000150768 and ENSG00000168291 |
| HuMAP2_03004 | ENSG00000126261 and ENSG00000142230 |
| HuMAP2_05459 | ENSG00000126261 and ENSG00000142230 |
| HuMAP2_01225 | ENSG00000014138 and ENSG00000101868 |
| HuMAP2_03036 | ENSG00000126934 and ENSG00000169032 |
| HuMAP2_05209 | ENSG00000062822 and ENSG00000077514 and ENSG00000106628 and ENSG00000175482 |
| HuMAP2_01375 | ENSG00000111716 and ENSG00000134333 |
| HuMAP2_04197 | ENSG00000111716 and ENSG00000134333 |
A summary of the findings so far:
- 443 complexes for which all subunits are included in Human-GEM, according to their UniProt IDs
- 5 complexes that had already been included in Human-GEM
- 259 huMAP complexes whose subunits overlap with existing Human-GEM complexes [without] links to any enzymatic reactions
- another 18 additional complexes that are potentially related between huMAP2 and Human-GEM
@Hao-Chalmers what do you mean by "potentially related", and how do these relate to the previously identified ones?
For the 18 "potentially related" pairs, the two complexes differ by only one subunit, i.e., huMAP2 subunit number = Human-GEM subunit number + 1, and the remaining subunits are identical within each pair. These 18 complexes are a subset of the 259.
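To make the criterion concrete, here is a minimal sketch of the "one extra subunit" test, assuming both sides are represented as sets of Ensembl gene IDs. The function name and the `toy_*` / `ENSG_EXTRA_*` identifiers are hypothetical. The Human-GEM complex is taken from the table above, but the huMAP memberships shown are invented for the example.

```python
def find_one_extra_pairs(humap_complexes, model_complexes):
    """Pair each huMAP complex with model complexes it extends by exactly one subunit."""
    pairs = []
    for complex_id, genes in humap_complexes.items():
        for existing in model_complexes:
            # Proper superset with exactly one additional gene.
            if existing < genes and len(genes) == len(existing) + 1:
                pairs.append((complex_id, existing))
    return pairs

model_complex = frozenset({"ENSG00000121879", "ENSG00000145675"})
humap_complexes = {
    # Made-up memberships: the real huMAP subunit lists are not given in this thread.
    "toy_plus_one": model_complex | {"ENSG_EXTRA_1"},
    "toy_identical": model_complex,
    "toy_plus_two": model_complex | {"ENSG_EXTRA_1", "ENSG_EXTRA_2"},
}

pairs = find_one_extra_pairs(humap_complexes, [model_complex])
print(pairs)
```

Only `toy_plus_one` is reported, since the identical complex and the one with two extra subunits both fail the size condition.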
@feiranl found this https://www.biorxiv.org/content/10.1101/2021.02.28.433152v1, which could also come in handy, but it needs to be run on the cluster. The (hypothetical) advantage is that it would require less manual work.
Working on this right now.