subclass reasoning/inference for UBERON

Open dkoslicki opened this issue 3 years ago • 3 comments

From the Relay, it appears that RTX-KG2 is not doing subclass inference for UBERON.

dkoslicki avatar Jun 06 '22 19:06 dkoslicki

I opened a Cypher session on kg2canonicalized.rtx.ai, which contains KG2.7.5c, and ran the following Cypher query:

match (n)-[r:`biolink:subclass_of`]->(m) where n.id =~ 'UBERON:.*' and m.id =~ 'UBERON:.*' return count(*);

and it returned 24,639. So it appears that there are 24,639 UBERON-[subclass_of]->UBERON type edges in KG2.7.5c. Here is an example:

match (n)-[r:`biolink:subclass_of`]->(m) where n.id =~ 'UBERON:.*' and m.id =~ 'UBERON:.*' return n.id, r.predicate, r.knowledge_source, m.id limit 10;

returning:

n.id | r.predicate | r.knowledge_source | m.id
-- | -- | -- | --
"UBERON:0018355" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0008293" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0008292" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0008291" | "biolink:subclass_of" | ["infores:genepio", "infores:uberon"] | "UBERON:0000022"
"UBERON:0034930" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0014480" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018688" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018538" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018539" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"
"UBERON:0018537" | "biolink:subclass_of" | ["infores:uberon"] | "UBERON:0000022"

So this does not seem to be an issue with KG2c itself, but rather with the RTX-KG2 API or PloverDB. I am tagging @amykglen in the hope that she can weigh in. If it is the RTX-KG2 API or PloverDB, I would vote to transfer this issue to the RTX repo issue tracker.

saramsey avatar Jun 10 '22 22:06 saramsey

yes, this is something we need to do with Plover. when we implemented subclass_of reasoning we only did it for the more common kinds of pinned query nodes (drugs, diseases), which seemed sufficient early on, but now we need to expand that. so I agree this issue can be transferred to the RTX repo.
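For readers unfamiliar with the idea, subclass_of reasoning on a pinned query node amounts to expanding the pinned ID into itself plus all of its transitive subclasses before matching edges. The following is only a minimal sketch of that expansion, not Plover's actual implementation: the function names and the `EX:` identifiers are hypothetical, and the toy edge list stands in for the `biolink:subclass_of` edges in the graph.

```python
from collections import defaultdict, deque


def build_child_index(subclass_edges):
    """Map each parent ID to the set of its direct subclass (child) IDs.

    subclass_edges is an iterable of (child_id, parent_id) pairs, i.e. the
    direction of a child -[subclass_of]-> parent edge.
    """
    children = defaultdict(set)
    for child, parent in subclass_edges:
        children[parent].add(child)
    return children


def expand_with_descendants(pinned_ids, children):
    """Return the pinned IDs plus all of their transitive subclasses."""
    expanded = set(pinned_ids)
    queue = deque(pinned_ids)
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            # The membership check also protects against cycles.
            if child not in expanded:
                expanded.add(child)
                queue.append(child)
    return expanded


# Hypothetical toy hierarchy (these are NOT real ontology identifiers).
TOY_EDGES = [
    ("EX:gland_subtype", "EX:gland"),
    ("EX:gland_subsubtype", "EX:gland_subtype"),
    ("EX:unrelated_child", "EX:other"),
]

children_index = build_child_index(TOY_EDGES)
print(sorted(expand_with_descendants(["EX:gland"], children_index)))
```

With this kind of expansion, a query pinned on a parent term also matches edges attached to any of its descendants, regardless of the category of the pinned node.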

amykglen avatar Jun 11 '22 02:06 amykglen

I'll be addressing this soon at the same time as #1812

amykglen avatar Apr 26 '23 18:04 amykglen

try to fix for end of Sprint 6? @amykglen

edeutsch avatar Sep 25 '24 17:09 edeutsch

this is live on KG2 Plover CI! for instance, submitting this query for molecular activities related to 'exocrine gland' to kg2cploverdb.ci.transltr.io returns results involving 'exocrine gland' but also 'liver':

{
  "edges": {
    "e00": {
      "object": "n01",
      "predicates": [
        "biolink:related_to"
      ],
      "subject": "n00"
    }
  },
  "nodes": {
    "n00": {
      "ids": [
        "UBERON:0002365"
      ]
    },
    "n01": {
      "categories": [
        "biolink:MolecularActivity"
      ]
    }
  }
}
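A query graph like the one above could be built and submitted programmatically along these lines. This is a rough sketch using only the Python standard library; the helper names are hypothetical, and the exact endpoint URL and whether the service expects the bare query graph as the POST body are assumptions that should be checked against the Plover documentation. The network call is defined but not executed here.

```python
import json
import urllib.request


def build_query_graph(pinned_id, object_category, predicate="biolink:related_to"):
    """Build a one-hop query graph with a pinned subject node."""
    return {
        "edges": {
            "e00": {"subject": "n00", "object": "n01", "predicates": [predicate]}
        },
        "nodes": {
            "n00": {"ids": [pinned_id]},
            "n01": {"categories": [object_category]},
        },
    }


def submit_query(query_graph, url, timeout=60):
    """POST the query graph as JSON to `url` and return the parsed response.

    `url` must be supplied by the caller; the endpoint path is an assumption
    and is deliberately not hard-coded here.
    """
    body = json.dumps(query_graph).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Same pinned node and category as the query in this comment.
query_graph = build_query_graph("UBERON:0002365", "biolink:MolecularActivity")
print(json.dumps(query_graph, indent=2, sort_keys=True))
```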

closing

amykglen avatar Oct 03 '24 02:10 amykglen