Obsolete OBA terms referencing obsoleted PRO entities

Open Copilot opened this issue 4 months ago • 9 comments

Five OBA terms reference protein entities that PRO has obsoleted: PR:000013872, PR:000030596, PR:Q12799, PR:Q6F5E7, and PR:Q9Y4X1.

Changes

  • Removed 5 entries from entity_attribute_location.tsv pattern data
  • Added 5 entries to obsoletes.tsv with obsoletion rationale
  • Regenerated definitions.owl and components/obsoletes.owl
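The first two changes amount to moving rows from one TSV to another. A minimal sketch of that step follows, assuming the pattern file keys each row on a `defined_class` column and that the obsoletion file accepts `defined_class` and `comment` columns. Those column names, the helper functions, and the placeholder row `OBA:9999999` are illustrative assumptions, not taken from the repository.

```python
import csv
import io

# Hypothetical two-row extract; the real file's columns may differ.
SAMPLE_TSV = (
    "defined_class\tentity\n"
    "OBA:2043233\tPR:000013872\n"
    "OBA:9999999\tPR:999999999\n"  # placeholder row that should be kept
)


def split_obsolete_rows(tsv_text, obsolete_ids, id_column="defined_class"):
    """Split TSV rows into those to keep and those to obsolete."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept, removed = [], []
    for row in reader:
        (removed if row[id_column] in obsolete_ids else kept).append(row)
    return reader.fieldnames, kept, removed


def obsoletion_rows(removed, reason):
    """Build rows for the obsoletion table (assumed columns)."""
    return [
        {"defined_class": row["defined_class"],
         "comment": f"{reason}: {row['entity']}"}
        for row in removed
    ]


def to_tsv(fieldnames, rows):
    """Serialise rows back to tab-separated text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


fields, kept, removed = split_obsolete_rows(SAMPLE_TSV, {"OBA:2043233"})
pattern_tsv = to_tsv(fields, kept)
obsoletes_tsv = to_tsv(
    ["defined_class", "comment"],
    obsoletion_rows(removed, "Entity obsoleted in PRO"),
)
```

Keeping the split separate from the serialisation makes it easy to review which rows would move before anything is written to disk.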

Obsoleted Terms

| OBA ID | PRO ID | Protein |
| --- | --- | --- |
| OBA:2043233 | PR:000013872 | regenerating islet-derived protein 3-gamma |
| OBA:2044443 | PR:000030596 | putative uncharacterized protein TXNRD3NB |
| OBA:2058599 | PR:Q12799 | putative T-complex protein 10A homolog (human) |
| OBA:2059278 | PR:Q6F5E7 | putative uncharacterized protein TXNRD3NB (human) |
| OBA:2081815 | PR:Q9Y4X1 | UDP-glucuronosyltransferase 2A1 (human) |

All terms now carry owl:deprecated="true" with comments referencing the obsoleted PRO entities.
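One way to spot-check this claim is to scan the regenerated RDF/XML for classes that carry the deprecation flag. The sketch below assumes the flag is serialised as a child element `<owl:deprecated>true</owl:deprecated>`, which is a common serialisation but not confirmed for this repository. The sample document and the placeholder class `OBA_9999999` are hypothetical.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical minimal document standing in for components/obsoletes.owl.
SAMPLE_OWL = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/OBA_2043233">
    <owl:deprecated rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</owl:deprecated>
  </owl:Class>
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/OBA_9999999"/>
</rdf:RDF>"""


def deprecated_classes(rdf_xml):
    """Return the IRIs of named classes flagged owl:deprecated true."""
    root = ET.fromstring(rdf_xml)
    found = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")  # anonymous classes have no IRI
        flag = cls.find(f"{{{OWL}}}deprecated")
        if iri and flag is not None and (flag.text or "").strip() == "true":
            found.append(iri)
    return found


deprecated = deprecated_classes(SAMPLE_OWL)
```

Comparing the returned IRIs against the five OBA IDs in the table would confirm that each one was actually deprecated.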

Original prompt

This section details the original issue you should resolve

<issue_title>Fix OBA terms with obsoleted entities</issue_title> <issue_description>Entities in the following terms have been obsoleted by the Protein Ontology:

OBA:2081815 OBA:2043233 OBA:2044443 OBA:2059278 OBA:2058599</issue_description>

Comments on the Issue (you are @copilot in this section)

  • Fixes obophenotype/bio-attribute-ontology#405

Copilot avatar Oct 29 '25 10:10 Copilot

@copilot Make a list of the obsoleted PR terms and suggest replacements from the Protein Ontology.

ar-ibrahim avatar Oct 29 '25 10:10 ar-ibrahim

> @copilot Make a list of the obsoleted PR terms and suggest replacements from the Protein Ontology.

Here's the list of obsoleted PRO terms with replacement suggestions:

| OBA Term | Obsolete PRO ID | Protein Name | Replacement | Notes |
| --- | --- | --- | --- | --- |
| OBA:2081815 | PR:Q9Y4X1 | UDP-glucuronosyltransferase 2A1 (human) | PR:P0DTE4 | Formal replacement available |
| OBA:2043233 | PR:000013872 | regenerating islet-derived protein 3-gamma | PR:Q6UW15 (human-specific) | Gene-level term obsoleted due to paralogs; human archetype exists |
| OBA:2044443 | PR:000030596 | putative uncharacterized protein TXNRD3NB | None | Withdrawn from UniProtKB - protein doesn't exist |
| OBA:2058599 | PR:Q12799 | putative T-complex protein 10A homolog (human) | None | Based on pseudogene - not a real protein |
| OBA:2059278 | PR:Q6F5E7 | putative uncharacterized protein TXNRD3NB (human) | None | Withdrawn from HGNC - protein doesn't exist |

Recommendation:

  • For OBA:2081815, consider creating a new term using PR:P0DTE4 if measurements of this protein are still relevant
  • For OBA:2043233, the human-specific PR:Q6UW15 could replace the obsolete gene-level term if human data is being measured
  • For the remaining three terms, no replacements are available as the proteins were determined to not exist

Copilot avatar Oct 29 '25 11:10 Copilot

@copilot I added 4 OBA terms to the issue. Make a list of the obsoleted PR terms and suggest replacements from the Protein Ontology.

ar-ibrahim avatar Nov 12 '25 11:11 ar-ibrahim

> @copilot I added 4 OBA terms to the issue. Make a list of the obsoleted PR terms and suggest replacements from the Protein Ontology.

I'd be happy to check for obsoleted PRO terms and suggest replacements. However, I cannot see the 4 additional OBA terms you mentioned adding to the issue. Could you please provide the OBA term IDs (e.g., OBA:XXXXXXX) so I can investigate them?

Alternatively, if you can share:

  • The OBA term IDs, or
  • The PRO (Protein Ontology) IDs they reference, or
  • A screenshot or paste of the updated issue description

I'll be able to check the PRO import for obsoletion status and suggest replacements where available.

Copilot avatar Nov 12 '25 11:11 Copilot

@copilot These are the OBA IDs I added to the issue. Make a list of the obsoleted PR terms and suggest replacements from the Protein Ontology.

OBA:1000220 OBA:1000596 OBA:1000264 OBA:1000294

ar-ibrahim avatar Nov 12 '25 11:11 ar-ibrahim

> @copilot These are the OBA IDs I added to the issue. Make a list of the obsoleted PR terms and suggest replacements from the Protein Ontology.
>
> OBA:1000220 ...

I've investigated these 4 OBA terms. They reference GO (Gene Ontology) terms for biological processes, not PRO (Protein Ontology) terms. None of the referenced GO terms are obsolete:

| OBA ID | Referenced Ontology | Entity ID | Entity Name | Status |
| --- | --- | --- | --- | --- |
| OBA:1000220 | GO | GO:0009069 | serine family amino acid metabolic process | ✅ Active |
| OBA:1000596 | GO | GO:0009066 | aspartate family amino acid metabolic process | ✅ Active |
| OBA:1000264 | GO | GO:0006486 | protein glycosylation | ✅ Active |
| OBA:1000294 | GO | GO:0070085 | glycosylation | ✅ Active |

Conclusion: These terms are all valid and do not need to be obsoleted. They're using the entity_attribute_part_of pattern with active GO process terms, not obsolete PRO protein terms.

This PR specifically addresses OBA terms that reference obsolete PRO entities (proteins), which are the 5 terms already handled: OBA:2081815, OBA:2043233, OBA:2044443, OBA:2059278, and OBA:2058599.

Copilot avatar Nov 12 '25 11:11 Copilot

@copilot These 4 GO terms have been obsoleted. Check again and suggest appropriate replacement terms from GO.

ar-ibrahim avatar Nov 12 '25 11:11 ar-ibrahim

> @copilot These 4 GO terms have been obsoleted. Check again and suggest appropriate replacement terms from GO.

You're correct - I apologize for the confusion. The GO import in this repository is from 2022-01-19 and may be outdated. The 4 GO terms appear active in the current import file, but if they've been obsoleted after that date, the import needs to be refreshed first.

Here's what I found in the current import:

  • GO:0009069 - serine family amino acid metabolic process (appears active)
  • GO:0009066 - aspartate family amino acid metabolic process (appears active)
  • GO:0006486 - protein glycosylation (appears active)
  • GO:0070085 - glycosylation (appears active)

To proceed, I need one of the following:

  1. Confirmation of the replacement GO term for each of these, or
  2. A decision on whether the GO import should be refreshed first to pick up the latest obsoletions and replacements.

Once I have the correct replacement terms, I can obsolete these 4 OBA terms (OBA:1000220, OBA:1000596, OBA:1000264, OBA:1000294) and optionally create new OBA terms using the replacement GO terms.
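Once a current release of the source ontology is at hand, the status check described above can be sketched as follows. This assumes the release is available in OBO flat-file format, where an obsolete term carries `is_obsolete: true` and may list `replaced_by:` or `consider:` tags. The `EX:` identifiers and the sample text are hypothetical placeholders, not real GO content.

```python
# Hypothetical OBO fragment; a real check would read a current release file.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: EX:0000001
name: obsolete example process
is_obsolete: true
replaced_by: EX:0000002

[Term]
id: EX:0000002
name: example process

[Typedef]
id: part_of
"""


def parse_term_stanzas(obo_text):
    """Collect tag/value pairs for every [Term] stanza."""
    stanzas, current = [], None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Only [Term] stanzas are of interest; skip [Typedef] and others.
            current = {} if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0].strip()  # drop trailing comment
            current.setdefault(tag, []).append(value)
    return stanzas


def obsoletion_report(obo_text, term_ids):
    """Report each requested ID as active, missing, or obsolete."""
    by_id = {s["id"][0]: s for s in parse_term_stanzas(obo_text) if "id" in s}
    report = {}
    for term_id in term_ids:
        stanza = by_id.get(term_id)
        if stanza is None:
            report[term_id] = "missing"
        elif "true" in stanza.get("is_obsolete", []):
            report[term_id] = {
                "replaced_by": stanza.get("replaced_by", []),
                "consider": stanza.get("consider", []),
            }
        else:
            report[term_id] = "active"
    return report


report = obsoletion_report(SAMPLE_OBO, ["EX:0000001", "EX:0000002", "EX:0000003"])
```

Running the same report against both the stale import and a fresh release would show whether a term only appears active because the import is out of date.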

Copilot avatar Nov 12 '25 11:11 Copilot