phenopacket-schema

OntologyClass for ICD-O?

Open mbaudis opened this issue 3 years ago • 8 comments

The Disease object references OntologyClass for ICD-O values used for primary_site annotations.

Which ICD-O ontology implementation does this refer to? While the codes themselves are well defined and well known, it is not clear which (public) source for CURIEs would be recommended (or, in fact, exists).

mbaudis avatar Jun 03 '21 09:06 mbaudis

Thanks for pointing this out. The documentation in the protobuf file needs to be revised to say "such as"; in no place do we want to prescribe a specific ontology. I agree it is hard to access ICD-O terms, but this is one option (https://apps.who.int/iris/handle/10665/96612). It would also be acceptable to use UBERON or NCIT terms here. @julesjacobsen

pnrobinson avatar Jun 03 '21 10:06 pnrobinson

@pnrobinson We have coded to ICD-O for ~20 years, and my original interest in ontologies w/ CURIEs started from the lack of those for ICD-O when writing the OntologyTerm use into the GA4GH metadata schema.

There is a representation in SNOMED, but this isn't correct & also not OA. So ICD-O became the driver for us to code the ICD-O (morphology + topography) doublets to NCIt while using a modified code representation:

We've worked on getting this into MONDO last year (w/ @cmungall and @nicolevasilevsky).

For practical purposes (i.e. talking to pathologists ...) we still code ICD-O and NCIt in parallel.

mbaudis avatar Jun 03 '21 11:06 mbaudis

@pnrobinson @julesjacobsen As much as I love ICD-O, I would drop it here since it requires coding of 2 arms & does not exist in an "ontologized" form. You can document/point out that when using standards like the current ICD-O, the codes should be converted to a suitable OntologyClass.
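As a rough illustration of that conversion step, the sketch below checks whether an identifier is CURIE-shaped (`PREFIX:local_id`) before it is used as an OntologyClass id. The regular expression and the name `is_curie` are assumptions made for illustration, not part of phenopacket-schema, and the pattern is a simplified shape check rather than the full CURIE grammar.

```python
import re

# Assumed, simplified CURIE shape: PREFIX:local_id.
# This is an illustrative check, not the full W3C CURIE grammar.
CURIE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_.]*:[^\s:]\S*$")


def is_curie(identifier: str) -> bool:
    """Return True if the identifier looks like PREFIX:local_id."""
    return CURIE_PATTERN.match(identifier) is not None


# Ontology-backed ids pass; a hyphenated ad hoc ICD-O id does not.
for term_id in ["NCIT:C3262", "UBERON:0000002", "icdot-C53.9"]:
    print(term_id, is_curie(term_id))
```

An identifier that fails such a check would need recoding to an ontology term (for example NCIt or UBERON) before use.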

mbaudis avatar Jun 04 '21 13:06 mbaudis

@pnrobinson @mbaudis Can you recommend an NCIT root term for this? NCIT:C12219?

julesjacobsen avatar Jun 09 '21 10:06 julesjacobsen

@julesjacobsen For Cancer it is NCIT:C3262 (I don't have to look this up :-)

I.e. "Neoplasm" root term, which covers also benign neoplasms.

mbaudis avatar Jun 09 '21 10:06 mbaudis

Thanks @mbaudis. However, isn't Neoplasm better placed in Biosample.histological_diagnosis? The Disease.primary_site ought to be an anatomy term. In the case below this should be cervix uteri - NCIT:C12311 == UBERON:0000002

[
  {
    "id": "NCIT:C4028",
    "label": "Cervical Squamous Cell Carcinoma, Not Otherwise Specified"
  },
  {
    "id": "icdom-80703",
    "label": "Squamous cell carcinoma, NOS"
  },
  {
    "id": "icdot-C53.9",
    "label": "cervix uteri"
  }
],
[
  {
    "id": "NCIT:C4029",
    "label": "Cervical Adenocarcinoma"
  },
  {
    "id": "icdom-81403",
    "label": "Adenocarcinoma, NOS"
  },
  {
    "id": "icdot-C53.9",
    "label": "cervix uteri"
  }
]

julesjacobsen avatar Jun 09 '21 16:06 julesjacobsen

@julesjacobsen Correct - my original comment then led to a more general drift... We use 1x NCIt neoplasm <=> 2x ICD-O in parallel, and recode ICD-O Topo to UBERON. So, yes: the NCIt Neoplasm subtree for Biosample.histological_diagnosis (we actually use it this way), and UBERON or a corresponding code for primary_site.
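A minimal sketch of that split, assuming annotation triples shaped like the JSON quoted above: the NCIt term goes to `histological_diagnosis` and the ICD-O topography code is recoded to UBERON for `primary_site`. The lookup table `ICDO_TOPO_TO_UBERON` and the function `split_annotation` are hypothetical names; the table holds only the cervix uteri pair mentioned in this thread and is not a published mapping.

```python
# Hypothetical lookup from ICD-O topography ids to UBERON terms.
# Only the cervix uteri pair discussed in this thread is included.
ICDO_TOPO_TO_UBERON = {
    "icdot-C53.9": {"id": "UBERON:0000002", "label": "uterine cervix"},
}


def split_annotation(terms):
    """Split one [NCIt, ICD-O morphology, ICD-O topography] triple.

    Returns the NCIt neoplasm term as histological_diagnosis and the
    UBERON recoding of the ICD-O topography term as primary_site.
    """
    ncit = next(t for t in terms if t["id"].startswith("NCIT:"))
    topo = next(t for t in terms if t["id"].startswith("icdot-"))
    return {
        "histological_diagnosis": ncit,
        "primary_site": ICDO_TOPO_TO_UBERON[topo["id"]],
    }


triple = [
    {"id": "NCIT:C4028",
     "label": "Cervical Squamous Cell Carcinoma, Not Otherwise Specified"},
    {"id": "icdom-80703", "label": "Squamous cell carcinoma, NOS"},
    {"id": "icdot-C53.9", "label": "cervix uteri"},
]
result = split_annotation(triple)
print(result["primary_site"]["id"])
print(result["histological_diagnosis"]["id"])
```

The ICD-O morphology term is left out of the result here because, in this scheme, the NCIt term already carries the diagnosis; it can still be stored in parallel.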

Just for emphasis: for the cancer use case, I still think that the ICD-O Topo coding is in principle the best match and is in widespread use. It is just that I'm not aware of a representation in a well-structured ontology with CURIEs to point to. This may (have) change(d) - I would be glad ...

mbaudis avatar Jun 09 '21 19:06 mbaudis

@balhoff can you comment?

mellybelly avatar Jun 09 '21 19:06 mellybelly