
Should Node Synonymizer have concepts not in KG2?

Open edeutsch opened this issue 2 years ago • 2 comments

Our Node Synonymizer has historically included only terms that are in RTX-KG2 (and I think the current version is the same). For example, RTX-KG2 does not include the term 'fentanyl overdose': https://arax.ncats.io/test/?term=fentanyl%20overdose

But other sources like BTE have information about this term (I think?), so in principle ARAX should be able to answer queries about it (just not using RTX-KG2). I'm wondering if it would be good to include all nodes from the SRI Node Normalizer in the Node Synonymizer, even if they are not present in KG2.

For context, see https://github.com/NCATSTranslator/Feedback/issues/246

edeutsch Jun 06 '23 14:06

I think ideally we would. I started that way with the new synonymizer but then opted not to because of the sheer size of the SRI Node Normalizer: something like 600 million nodes and 200 million edges. It would require a pretty large instance to build.

I think a temporary solution for this might be falling back to querying the SRI Node Normalizer API on the fly for identifiers the NodeSynonymizer doesn't recognize.
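A rough sketch of what that fallback could look like, not actual ARAX code. Everything named here is an assumption for illustration: `local_lookup` stands in for whatever NodeSynonymizer call is used, the endpoint URL and the `curie` query parameter should be checked against the deployed SRI Node Normalizer, and the response is assumed to be a dict keyed by CURIE with `None` for unrecognized identifiers.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable, List, Optional

# ASSUMPTION: endpoint and parameter name; verify against the live service.
SRI_NN_URL = "https://nodenormalization-sri.renci.org/get_normalized_nodes"

# A lookup takes a list of CURIEs and returns {curie: info-dict or None}.
Lookup = Callable[[List[str]], Dict[str, Optional[dict]]]


def query_sri_node_normalizer(curies: List[str], timeout: float = 10.0) -> Dict[str, Optional[dict]]:
    """Hypothetical remote lookup: one GET request with repeated `curie` params."""
    if not curies:
        return {}
    query = urllib.parse.urlencode([("curie", c) for c in curies])
    with urllib.request.urlopen(f"{SRI_NN_URL}?{query}", timeout=timeout) as resp:
        return json.load(resp)


def lookup_with_fallback(
    curies: Iterable[str],
    local_lookup: Lookup,
    remote_lookup: Lookup = query_sri_node_normalizer,
) -> Dict[str, Optional[dict]]:
    """Try the local synonymizer first; ask the remote service only for misses."""
    curies = list(curies)
    results = dict(local_lookup(curies))
    missing = [c for c in curies if results.get(c) is None]
    if missing:
        try:
            remote = remote_lookup(missing)
        except OSError:
            # Network failure (URLError is an OSError): keep the misses as None.
            remote = {}
        for c in missing:
            results[c] = remote.get(c)
    return results
```

Only identifiers the local lookup misses go over the network, so the common case stays fast, and a remote outage degrades to the current behavior rather than failing the query.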

amykglen Jun 06 '23 15:06

Let's put this off for a few months until Amy returns.

edeutsch Jun 14 '23 18:06