DTD inferred treats with 1 result and no support_graph

Open edeutsch opened this issue 2 years ago • 6 comments

https://arax.ncats.io/devED/?r=142087

Inferred treats queries usually return 10 results, but here only 1 is returned, and the treats edge also does not have a support_graph.

Came in via Feedback: https://github.com/NCATSTranslator/Feedback/issues/302

Ideas on what's going on here? @chunyuma or anyone else?

edeutsch avatar Jun 07 '23 14:06 edeutsch

I found another example of this: https://arax.ncats.io/beta/?r=142236

{
  "edges": {
    "e0": {
      "attribute_constraints": [],
      "exclude": false,
      "object": "n1",
      "predicates": [
        "biolink:treats"
      ],
      "qualifier_constraints": [],
      "subject": "n0",
      "knowledge_type": "inferred"
    }
  },
  "nodes": {
    "n0": {
      "constraints": [],
      "ids": [
        "MONDO:0009903"
      ],
      "is_set": false
    },
    "n1": {
      "categories": [
        "biolink:ChemicalEntity"
      ],
      "constraints": [],
      "is_set": false
    }
  }
}

dkoslicki avatar Jun 07 '23 20:06 dkoslicki

just FYI - it looks like this is not a bug in the ResultTransformer, because the knowledge graph for David's query above already only contains two nodes and two edges when ResultTransformer is called

amykglen avatar Jun 07 '23 23:06 amykglen

Hi @edeutsch, @amykglen or @dkoslicki, could you please help me look at two places (here and here) in "infer_utilities.py"?

I found that @dkoslicki previously wrote this code and used the shorthand kedges for the long expression response.envelope.message.knowledge_graph.edges. However, when I checked their IDs to see whether they refer to the same object in memory, I found that the ID of response.envelope.message.knowledge_graph.edges keeps changing, while the ID of kedges stays the same.

Please see their IDs:

response.envelope.message.knowledge_graph.edges id: 140462370426240
message.knowledge_graph.edges id 140462370426240
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462341671040
message.knowledge_graph.edges id 140462341671040
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462341676416
message.knowledge_graph.edges id 140462341676416
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462341680960
message.knowledge_graph.edges id 140462341680960
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462341683968
message.knowledge_graph.edges id 140462341683968
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462343907584
message.knowledge_graph.edges id 140462343907584
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462343908352
message.knowledge_graph.edges id 140462343908352
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462351358528
message.knowledge_graph.edges id 140462351358528
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462351360192
message.knowledge_graph.edges id 140462351360192
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462351366272
message.knowledge_graph.edges id 140462351366272
kedges id 140462370429312
######################
response.envelope.message.knowledge_graph.edges id: 140462335349056
message.knowledge_graph.edges id 140462335349056
kedges id 140462370429312

I'm curious why this happens. Is there any mechanism within the class Response that causes the ID change?

I think this is the main reason why the xDTD inferred treats has only 1 result.
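
For readers unfamiliar with this failure mode, here is a minimal sketch of how such ID drift can arise in plain Python. It assumes that some step rebinds the edges attribute to a new dict instead of mutating the existing one. The classes and the merge_step function are hypothetical stand-ins, not the actual ARAX Response code, and the real cause in ARAX was not established in this thread.

```python
# Hypothetical stand-ins for the message structure; not the ARAX classes.
class KnowledgeGraph:
    def __init__(self):
        self.edges = {}


class Message:
    def __init__(self):
        self.knowledge_graph = KnowledgeGraph()


def merge_step(message, new_edges):
    # Assumption: this step REBINDS the attribute to a brand-new dict
    # rather than updating the existing dict in place.
    message.knowledge_graph.edges = {**message.knowledge_graph.edges, **new_edges}


message = Message()
kedges = message.knowledge_graph.edges  # shorthand alias to the current dict
same_before = kedges is message.knowledge_graph.edges  # alias and attribute match

merge_step(message, {"e1": {"predicate": "biolink:treats"}})
same_after = kedges is message.knowledge_graph.edges  # alias now points at the old dict

# Writes through the stale alias never reach the message.
kedges["e2"] = {"predicate": "biolink:treats"}
lost_edge = "e2" not in message.knowledge_graph.edges

print(id(kedges), id(message.knowledge_graph.edges), same_before, same_after, lost_edge)
```

Under that assumption, reading message.knowledge_graph.edges at each use, rather than caching it in a local name, always sees the current object.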

chunyuma avatar Jun 14 '23 23:06 chunyuma

I am puzzled, too, although I am not really certain where these print statements are and what the code path is. I stared at it for a while and don't see an answer.

edeutsch avatar Jun 15 '23 02:06 edeutsch

No worries, @edeutsch. I added a few lines to print this information for checking, so you would not see these print statements in the code. Although it is weird, I think I can just replace kedges with message.knowledge_graph.edges to solve this issue. Thanks!

chunyuma avatar Jun 15 '23 16:06 chunyuma

OK, I fixed that ID-change problem, and it should now return the results correctly. Note that some drug-disease pairs may not have any support graphs, probably because 1) KG2 lacks a path connecting them; 2) some KG2 paths were filtered out by filtering thresholds (e.g., number of PubMed IDs, NGD); or 3) the paths are too generic or not biologically reasonable because they contain 'biolink:related_to', 'biolink:coexists_with', or 'biolink:contraindicated_for'.
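
As a rough illustration of reason 3, the sketch below drops any candidate path that contains one of the three predicates listed above. The keep_path function, the path representation (a list of predicate strings), and the sample paths are assumptions made for this example, not the actual xDTD filtering code.

```python
# Predicates named in this comment as too generic or not biologically reasonable.
EXCLUDED_PREDICATES = {
    "biolink:related_to",
    "biolink:coexists_with",
    "biolink:contraindicated_for",
}


def keep_path(path_predicates):
    """Return True if no predicate on the path is in the excluded set (hypothetical helper)."""
    return not any(predicate in EXCLUDED_PREDICATES for predicate in path_predicates)


# Illustrative sample paths, each represented as a list of edge predicates.
candidate_paths = [
    ["biolink:interacts_with", "biolink:causes"],
    ["biolink:related_to", "biolink:causes"],
    ["biolink:contraindicated_for"],
]

kept_paths = [path for path in candidate_paths if keep_path(path)]
print(kept_paths)
```

If every candidate path for a drug-disease pair is rejected this way, the pair ends up with no support graph even though the treats edge itself is returned.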

We need to verify this bug has been fixed after rolling out to /dev or /test.

chunyuma avatar Jun 16 '23 16:06 chunyuma

Closing this, as I have verified it has been resolved in https://arax.rtx.ai/.

chunyuma avatar Jun 18 '24 00:06 chunyuma