go-site
MGI GAF includes PRO isoforms as the main annotatable object
✗ getgaf mgi | egrep '\tprotein\t' | tail
PR Q9Z2D6-2 mMECP2/iso:2 located_in GO:0005634 PMID:18334558 IDA C methyl-CpG-binding protein 2 isoform 2 (mouse) mMECP2/iso:2|MECP2b (mouse)|MECP2e1 (mouse) protein taxon:10090 20120126 MGI
PR Q9Z2D6-1 mMECP2/iso:1 located_in GO:0005634 PMID:18334558 IDA C methyl-CpG-binding protein 2 isoform 1 (mouse) mMECP2/iso:1|MECP2a (mouse)|MECP2e2 (mouse) protein taxon:10090 20101014 MGI
PR Q9Z2D6-2 mMECP2/iso:2 located_in GO:0005634 PMID:15034150 IDA C methyl-CpG-binding protein 2 isoform 2 (mouse) mMECP2/iso:2|MECP2b (mouse)|MECP2e1 (mouse) protein taxon:10090 20120126 MGI
PR Q9Z2D6-2 mMECP2/iso:2 acts_upstream_of_or_within GO:0006641 PMID:30137367 IMP MGI:MGI:5584016 P methyl-CpG-binding protein 2 isoform 2 (mouse) mMECP2/iso:2|MECP2b (mouse)|MECP2e1 (mouse) protein taxon:10090 20200304 MGI
PR Q9Z2D6-1 mMECP2/iso:1 part_of GO:0000792 PMID:18334558 IDA C methyl-CpG-binding protein 2 isoform 1 (mouse) mMECP2/iso:1|MECP2a (mouse)|MECP2e2 (mouse) protein taxon:10090 20201009 MGI
PR Q9Z2D6-2 mMECP2/iso:2 enables GO:0003682 PMID:18334558 IDA F methyl-CpG-binding protein 2 isoform 2 (mouse) mMECP2/iso:2|MECP2b (mouse)|MECP2e1 (mouse) protein taxon:10090 20120126 MGI
PR Q08460-4 mKCNMA1/iso:4 enables GO:0015269 PMID:16081418 IDA F calcium-activated potassium channel subunit alpha-1 isoform 4 (mouse) mKCNMA1/iso:4|calcium-activated potassium channel subunit alpha-1 isoform STREX-1 (mouse) protein taxon:10090 20080813 MGI
PR Q08460-1 mKCNMA1/iso:1 enables GO:0015269 PMID:16081418 IDA F calcium-activated potassium channel subunit alpha-1 isoform 1 (mouse) mKCNMA1/iso:1 protein taxon:10090 20140331 MGI
PR Q08460-1 mKCNMA1/iso:1 enables GO:0005249 PMID:16081418 IDA F calcium-activated potassium channel subunit alpha-1 isoform 1 (mouse) mKCNMA1/iso:1 protein taxon:10090 20140331 MGI
PR Q08460-4 mKCNMA1/iso:4 enables GO:0005249 PMID:16081418 IDA F calcium-activated potassium channel subunit alpha-1 isoform 4 (mouse) mKCNMA1/iso:4|calcium-activated potassium channel subunit alpha-1 isoform STREX-1 (mouse) protein taxon:10090 20080811 MGI
These should not be here. Instead, the annotation should be rolled up to the gene (e.g. Kcnma1 in the case of Q08460), and the isoform should go in column 17 (Gene Product Form ID).
Here is an example of how it should be done:
MGI MGI:1926176 Gas2l1 located_in GO:0005737 MGI:MGI:3052497|PMID:12584248 IDA C growth arrest-specific 2 like 1 4930500E24Rik|D0Jmb1|GAR22|TU-71.1 protein_coding_gene taxon:10090 20120921 UniProt part_of(CL:0000586)|part_of(CL:0000017) UniProtKB:Q8JZP9-2
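To make the expected roll-up concrete, here is a rough Python sketch. It is not the actual go-site code: `ISOFORM_TO_GENE` and `roll_up` are hypothetical names, the MGI accession in the lookup is a placeholder rather than the real ID for Kcnma1, and the row is assumed to be an already-split list of GAF columns. It only rewrites the DB, object ID, symbol, and type, and moves the isoform to column 17; gene name and synonyms would also need replacing in a real implementation.

```python
# Hypothetical lookup from a PRO isoform CURIE to its parent gene.
# "MGI:0000000" is a placeholder, NOT the real accession for Kcnma1.
ISOFORM_TO_GENE = {
    "PR:Q08460-4": ("MGI:0000000", "Kcnma1", "protein_coding_gene"),
}


def roll_up(row):
    """Return a 17-column copy of a GAF row, rolled up to the gene if possible.

    Rows that are not PRO protein rows, or whose isoform is not in the
    lookup, are returned unchanged apart from padding to 17 columns.
    """
    cols = list(row) + [""] * (17 - len(row))
    db, object_id, object_type = cols[0], cols[1], cols[11]
    if db != "PR" or object_type != "protein":
        return cols
    isoform = f"{db}:{object_id}"
    gene = ISOFORM_TO_GENE.get(isoform)
    if gene is None:
        return cols
    gene_id, symbol, gene_type = gene
    cols[0] = "MGI"        # column 1: DB
    cols[1] = gene_id      # column 2: DB Object ID
    cols[2] = symbol       # column 3: DB Object Symbol
    cols[11] = gene_type   # column 12: DB Object Type
    cols[16] = isoform     # column 17: Gene Product Form ID
    return cols


example = [
    "PR", "Q08460-4", "mKCNMA1/iso:4", "enables", "GO:0005249",
    "PMID:16081418", "IDA", "", "F",
    "calcium-activated potassium channel subunit alpha-1 isoform 4 (mouse)",
    "mKCNMA1/iso:4", "protein", "taxon:10090", "20080811", "MGI",
]
rolled = roll_up(example)
print(rolled[0], rolled[1], rolled[2], rolled[11], rolled[16])
```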
Note the behavior is correct for all UniProt-sourced annotations and incorrect for MGI-sourced annotations (which use PRO).
I assume that this is a matter of the roll-up code needing to deal with both PRO isoforms and UniProt isoforms. The situation is inherently confusing because in many cases the local IDs are the same (e.g. Q08460-4), yet the actual prefixed ID is arbitrarily different (PR:Q08460-4 versus UniProtKB:Q08460-4).
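One way the roll-up code could treat the two prefixes uniformly is to compare isoforms by local ID once the prefix is known to be one of the two isoform namespaces. The sketch below is only an illustration under that assumption: `split_curie`, `same_isoform`, and `ISOFORM_PREFIXES` are hypothetical names, and it does not cover PRO terms whose local IDs are not UniProt-style accessions.

```python
# Prefixes assumed to carry UniProt-style isoform local IDs.
ISOFORM_PREFIXES = {"PR", "UniProtKB"}


def split_curie(curie):
    """Split 'PREFIX:LOCAL' into (prefix, local) at the first colon."""
    prefix, _, local = curie.partition(":")
    return prefix, local


def same_isoform(a, b):
    """True when both CURIEs use a known isoform prefix and share a local ID."""
    prefix_a, local_a = split_curie(a)
    prefix_b, local_b = split_curie(b)
    return (
        prefix_a in ISOFORM_PREFIXES
        and prefix_b in ISOFORM_PREFIXES
        and local_a == local_b
    )


print(same_isoform("PR:Q08460-4", "UniProtKB:Q08460-4"))
print(same_isoform("PR:Q08460-4", "UniProtKB:Q08460-1"))
```

With a check like this, the same lookup from isoform to gene could be keyed on the local ID and serve both MGI-sourced and UniProt-sourced rows.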