
conflate drugs and small molecules, genes and genomes

Open sierra-moxon opened this issue 3 years ago • 2 comments

Service Provider / BTE uses different mappings from biolink-model, so the data integrates with other sources better (multi-hop queries):

- STY:T195 (antb, full name: Antibiotic) -> We map it to SmallMolecule (rather than Drug)
- STY:T121 (phsu, full name: Pharmacologic Substance) -> We map it to SmallMolecule (rather than Drug)
- STY:T028 (gngm, full name: Gene or Genome) -> We map it to Gene (rather than GenomicEntity)
- STY:T114 (nnon, full name: Nucleic Acid, Nucleoside, or Nucleotide) -> We map it to SmallMolecule (rather than NucleicAcidEntity)
- STY:T127 (vita, full name: Vitamin) -> We map it to SmallMolecule (rather than Vitamin)
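For readers who want to compare the two mappings side by side, here is a minimal sketch that transcribes the pairs listed above into a lookup table. The names `STY_MAPPINGS` and `category_for` are hypothetical and are not taken from BTE or the biolink-model tooling; only the semantic type codes and category names come from this issue.

```python
# Hypothetical lookup table: UMLS semantic type -> (BTE category, biolink-model category).
# The pairs are transcribed from the list in this issue; nothing here is BTE's actual code.
STY_MAPPINGS = {
    "STY:T195": ("SmallMolecule", "Drug"),               # antb, Antibiotic
    "STY:T121": ("SmallMolecule", "Drug"),               # phsu, Pharmacologic Substance
    "STY:T028": ("Gene", "GenomicEntity"),               # gngm, Gene or Genome
    "STY:T114": ("SmallMolecule", "NucleicAcidEntity"),  # nnon, Nucleic Acid, Nucleoside, or Nucleotide
    "STY:T127": ("SmallMolecule", "Vitamin"),            # vita, Vitamin
}


def category_for(sty_curie, source="bte"):
    """Return the category CURIE assigned to a semantic type by the chosen source.

    `source` is "bte" for the Service Provider / BTE mapping; any other value
    selects the biolink-model mapping.
    """
    bte_category, biolink_category = STY_MAPPINGS[sty_curie]
    chosen = bte_category if source == "bte" else biolink_category
    return "biolink:" + chosen


# Semantic types where the two sources assign different categories.
differing = [sty for sty, (bte, biolink) in STY_MAPPINGS.items() if bte != biolink]
```

Every entry in the table differs between the two sources, which is why all five are listed in this issue.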

(via #935)

sierra-moxon, May 10 '22 23:05

Can someone clarify what we mean when we talk about 'conflation' in Translator? My understanding is that conflation, as we have been debating it in Translator, occurs when two entities/nodes in a KG are collapsed into one entity/node because they are 'equivalent enough' for a particular use case. The classic example is genes and proteins: when this particular conflation is 'turned on', NCBIGene:672 (BRCA1 gene) and Uniprot:P38398 (BRCA1 protein) get represented as a single node in a KG, whose IRI is the one from the preferred namespace for the priority entity type (e.g. if the preferred entity type is gene > protein, and the preferred namespace for gene identifiers according to the node normalizer is hgnc, then this 'collapsed' node gets the IRI HGNC:1100). All edges and properties owned by these once-separate nodes are then inherited by the new collapsed node. Do I have this right?
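To make the node-collapsing sense of conflation concrete, here is a rough sketch under stated assumptions. The function `conflate`, the toy graph, the `EX:` identifiers, and the property names are all made up for illustration; this is not how the node normalizer or any Translator component is actually implemented. Only the three BRCA1 identifiers come from the example above.

```python
def conflate(nodes, edges, clique, preferred_id):
    """Collapse every node ID in `clique` into one node named `preferred_id`.

    nodes: dict mapping node ID -> dict of properties
    edges: list of (subject, predicate, object) triples
    clique: sequence of node IDs judged 'equivalent enough' to merge
    """
    members = set(clique)

    # The collapsed node inherits the properties of every member.
    # On a key clash, later members in `clique` overwrite earlier ones.
    merged_props = {}
    for node_id in clique:
        merged_props.update(nodes.get(node_id, {}))

    new_nodes = {nid: props for nid, props in nodes.items() if nid not in members}
    new_nodes[preferred_id] = merged_props

    # Every edge that touched a member now points at the collapsed node.
    def remap(nid):
        return preferred_id if nid in members else nid

    new_edges = [(remap(s), p, remap(o)) for s, p, o in edges]
    return new_nodes, new_edges


# Toy graph: the EX: nodes and all properties are invented for illustration.
nodes = {
    "NCBIGene:672": {"symbol": "BRCA1"},
    "Uniprot:P38398": {"protein_name": "BRCA1 protein"},
    "EX:disease1": {},
    "EX:chemical1": {},
}
edges = [
    ("NCBIGene:672", "biolink:related_to", "EX:disease1"),
    ("EX:chemical1", "biolink:related_to", "Uniprot:P38398"),
]

conflated_nodes, conflated_edges = conflate(
    nodes, edges, clique=["NCBIGene:672", "Uniprot:P38398"], preferred_id="HGNC:1100"
)
```

After the call, the gene and protein nodes are gone, and both edges attach to the single `HGNC:1100` node. Choosing a category for a node, by contrast, merges nothing: each node keeps its own identity and only its label changes.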

If so, then what is described above doesn't sound like conflation in this sense to me, as there is no collapsing/merging of nodes in a KG. Rather, the issue is that entities/nodes representing a genome might get categorized as a biolink:Gene. But we are not talking about smushing two nodes together into one node, correct? Maybe this is an example of conflation in a more general sense ('little c' conflation), and the specific form of conflation I describe above is a special form/manifestation that is of particular interest in Translator ('big C' Conflation)? @sierra-moxon @cbizon Please advise.

mbrush, May 19 '22 01:05

If I understand correctly, then I agree @mbrush. This just seems like a difference of opinion about the way things should be mapped from UMLS to biolink?

cbizon, May 19 '22 13:05