
Improve synonymization of Technetium Tc-99m Albumin Aggregated

Open amykglen opened this issue 4 years ago • 5 comments

as @edeutsch requested, I traced a couple of instances where the local fastNGD database (#729) 'misses' a concept but eUtils doesn't - this is the write-up for one example: NCIT:C87398 (Technetium Tc-99m Albumin Aggregated).

  - 2020-08-25 15:36:43.838157 DEBUG: Had to use eUtils to compute NGD between renal cell carcinoma (MONDO:0005086) and Technetium Tc-99m Albumin Aggregated (NCIT:C87398)(value is: 0.7492777150971099)

from kg2canonicalized:

| n.id | n.name | n.equivalent_curies |
| --- | --- | --- |
| "NCIT:C87398" | "Technetium Tc-99m Albumin Aggregated" | ["NCIT:C87398", "CHEMBL.COMPOUND:CHEMBL1201522"] |

so in this case, we can see there's no MESH curie in the equivalent_curies, and I confirmed that neither of the equivalent nodes nor their attached edges have publications listed in KG2, so it's not surprising fastNGD isn't aware of any PMIDs for this node.
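to make the 'miss' concrete, here's a rough sketch of NGD computed from per-concept PMID sets using the standard Normalized Google Distance formula. this is not the actual fastNGD code: the function name `ngd_from_pmids`, the set-based inputs, and the `corpus_size` parameter are all assumptions for illustration. the point is only that a concept with no known PMIDs gives nothing to compute with, so a fallback such as eUtils is needed.

```python
import math
from typing import Optional, Set


def ngd_from_pmids(pmids_a: Set[str], pmids_b: Set[str], corpus_size: int) -> Optional[float]:
    """Standard NGD formula applied to two sets of PMIDs (hypothetical helper).

    corpus_size is an assumed total article count; it must be larger than
    either set for the denominator to be positive.
    """
    # No PMIDs known for one of the concepts: this is the 'miss' case,
    # so signal that the caller has to fall back to another source.
    if not pmids_a or not pmids_b:
        return None

    shared = len(pmids_a & pmids_b)
    if shared == 0:
        # The concepts never co-occur, so the distance is unbounded.
        return math.inf

    log_a = math.log(len(pmids_a))
    log_b = math.log(len(pmids_b))
    return (max(log_a, log_b) - math.log(shared)) / (math.log(corpus_size) - min(log_a, log_b))
```

with this sketch, a node like NCIT:C87398 that has an empty PMID set returns `None` no matter how many articles the other concept has.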

but what is interesting is that there is a MESH node in KG2 named "Technetium Tc 99m Aggregated Albumin" (word order is slightly different):

| n.id | n.name | n.equivalent_curies |
| --- | --- | --- |
| "MESH:D013668" | "Technetium Tc 99m Aggregated Albumin" | ["UMLS:C0740185", "UMLS:C0087067", "UMLS:C0087068", "MESH:D013668"] |

and there are definitely PubMed articles associated with MESH term D013668, so if NodeSynonymizer synonymized these two concepts, then the fastNGD system would no longer 'miss' NCIT:C87398/CHEMBL.COMPOUND:CHEMBL1201522.
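one way the two names could be matched is by comparing them with punctuation and word order ignored. the snippet below is a minimal sketch of that idea only; `name_key` is a hypothetical helper, not part of NodeSynonymizer, and I'm not claiming this is how the synonymizer currently compares names.

```python
import re
from typing import Tuple


def name_key(name: str) -> Tuple[str, ...]:
    """Build an order-insensitive key from a concept name (hypothetical helper).

    Lowercases the name, splits on anything that is not a letter or digit
    (so "Tc-99m" and "Tc 99m" tokenize the same way), and sorts the tokens.
    """
    tokens = re.split(r"[^a-z0-9]+", name.lower())
    return tuple(sorted(t for t in tokens if t))


ncit_name = "Technetium Tc-99m Albumin Aggregated"
mesh_name = "Technetium Tc 99m Aggregated Albumin"

# Both names reduce to the same sorted token tuple.
names_match = name_key(ncit_name) == name_key(mesh_name)
```

ignoring word order is a blunt rule and could merge names that mean different things, so a match like this would probably be better used as a candidate to check than as an automatic merge.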

amykglen, Aug 31 '20 23:08