TRAPI query with an "exclude" edge causes ARAX to run indefinitely

Open CaseyTa opened this issue 2 years ago • 5 comments

The Clinical Data Committee are working on a Curated Query Service (CQS) query that we're temporarily testing out directly on ARAX. We noticed that when running the full "Path A" query, ARAX seems to complete all workflow operations but gets stuck when trying to transform the results to TRAPI 1.4. The ARAX Status log shows:

...
2023-07-25T22:01:33.536269 INFO: Transforming results to TRAPI 1.4 format (moving 'virtual' nodes/edges to support graphs)
2023-07-25T22:01:33.536321 DEBUG: Original input QG contained qnodes {'n4', 'n0', 'n1', 'n3', 'n2'} and qedges {'e1', 'e0', 'e4', 'e2', 'e3'}
2023-07-25T22:01:33.536346 DEBUG: Non-orphan qnodes in original QG are: {'n4', 'n1', 'n3', 'n0', 'n2'}
2023-07-25T22:04:34.014984 DEBUG: Query is still progressing...

with repeated "Query is still progressing..." messages. I waited ~3 hours once, but it never finished. (I can provide more of the log if helpful.)

After removing edge e3 from the query graph, ARAX completes the query as expected. When removing other edges while keeping e3, ARAX again hangs.
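
For anyone who wants to repeat that experiment without hand-editing the JSON, here is a minimal sketch of stripping "exclude" edges from a TRAPI query. The helper name `drop_exclude_edges` and the trimmed-down `path_a` skeleton are illustrative assumptions, not part of ARAX; the sketch also drops qnodes that end up orphaned (n4 here) and does not touch the workflow or the `is_set` flags.

```python
import copy


def drop_exclude_edges(query: dict) -> dict:
    """Return a copy of a TRAPI query without its "exclude": true qedges.

    Qnodes that no remaining qedge references are dropped as well.
    (Hypothetical helper for illustration; not an ARAX function.)
    """
    trimmed = copy.deepcopy(query)
    qg = trimmed["message"]["query_graph"]
    # Keep only the qedges that are not marked as exclusions.
    qg["edges"] = {
        key: edge for key, edge in qg["edges"].items()
        if not edge.get("exclude", False)
    }
    # Collect every qnode key still used as a subject or object.
    referenced = {
        edge[end] for edge in qg["edges"].values() for end in ("subject", "object")
    }
    qg["nodes"] = {key: node for key, node in qg["nodes"].items() if key in referenced}
    return trimmed


# Skeleton of the "Path A" query graph (predicates, qualifiers and workflow omitted).
path_a = {
    "message": {
        "query_graph": {
            "edges": {
                "e0": {"subject": "n0", "object": "n1", "exclude": False},
                "e1": {"subject": "n1", "object": "n2", "exclude": False},
                "e2": {"subject": "n3", "object": "n2", "exclude": False},
                "e3": {"subject": "n3", "object": "n4", "exclude": True},
                "e4": {"subject": "n2", "object": "n0", "exclude": False},
            },
            "nodes": {
                "n0": {"ids": ["MONDO:0009061"]},
                "n1": {"categories": ["biolink:ChemicalEntity"]},
                "n2": {"categories": ["biolink:Gene", "biolink:Protein"]},
                "n3": {"categories": ["biolink:ChemicalEntity"]},
                "n4": {"ids": ["MONDO:0009061"]},
            },
        }
    }
}

trimmed = drop_exclude_edges(path_a)
print(sorted(trimmed["message"]["query_graph"]["edges"]))  # e3 is gone
print(sorted(trimmed["message"]["query_graph"]["nodes"]))  # n4 is gone
```

The input is deep-copied, so the original query stays available for a side-by-side run with and without the exclude edge.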

The full "Path A" query, including e3 (exclude edge):

{
  "workflow": [
    {
      "id": "fill"
    },
    {
      "id": "bind"
    },
    {
      "id": "complete_results"
    },
    {
      "id": "score"
    },
    {
      "id": "filter_results_top_n",
      "parameters": {
        "max_results": 100
      }
    }
  ],
  "message": {
    "query_graph": {
      "edges": {
        "e0": {
          "exclude": false,
          "predicates": [
            "biolink:correlated_with",
            "biolink:associated_with_likelihood_of"
          ],
          "subject": "n0",
          "object": "n1"
        },
        "e1": {
          "exclude": false,
          "subject": "n1",
          "object": "n2",
          "predicates": [
            "biolink:affects"
          ],
          "qualifier_constraints": [
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "increased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "activity_or_abundance"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            },
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "increased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "expression"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            },
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "increased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "secretion"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            },
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "decreased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "degradation"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            }
          ]
        },
        "e2": {
          "exclude": false,
          "subject": "n3",
          "object": "n2",
          "predicates": [
            "biolink:affects"
          ],
          "qualifier_constraints": [
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "increased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "activity_or_abundance"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            },
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "increased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "expression"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            },
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "increased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "secretion"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            },
            {
              "qualifier_set": [
                {
                  "qualifier_type_id": "biolink:object_direction_qualifier",
                  "qualifier_value": "decreased"
                },
                {
                  "qualifier_type_id": "biolink:object_aspect_qualifier",
                  "qualifier_value": "degradation"
                },
                {
                  "qualifier_type_id": "biolink:qualified_predicate",
                  "qualifier_value": "biolink:causes"
                }
              ]
            }
          ]
        },
        "e3": {
          "exclude": true,
          "predicates": [
            "biolink:correlated_with",
            "biolink:associated_with_likelihood_of"
          ],
          "subject": "n3",
          "object": "n4"
        },
        "e4": {
          "exclude": false,
          "subject": "n2",
          "object": "n0",
          "predicates": [
            "biolink:contributes_to",
            "biolink:associated_with",
            "biolink:gene_associated_with_condition"
          ]
        }
      },
      "nodes": {
        "n0": {
          "ids": [
            "MONDO:0009061"
          ],
          "is_set": false
        },
        "n1": {
          "categories": [
            "biolink:ChemicalEntity"
          ],
          "is_set": false
        },
        "n2": {
          "categories": [
            "biolink:Gene",
            "biolink:Protein"
          ],
          "is_set": false
        },
        "n3": {
          "categories": [
            "biolink:ChemicalEntity"
          ],
          "is_set": false
        },
        "n4": {
          "ids": [
            "MONDO:0009061"
          ],
          "is_set": false
        }
      }
    }
  }
}

Modified query without the e3 edge (which works):

{
    "workflow":
    [
        {
            "id": "fill",
            "parameters":
            {
                "allowlist":
                [
                    "infores:cohd",
                    "infores:automat-icees-kg"
                ],
                "qedge_keys":
                [
                    "e0"
                ]
            }
        },
        {
            "id": "fill",
            "parameters":
            {
                "allowlist": [],
                "qedge_keys":
                [
                    "e1",
                    "e2",
                    "e4"
                ]
            }
        },
        {
            "id": "bind"
        },
        {
            "id": "complete_results"
        },
        {
            "id": "score"
        },
        {
            "id": "filter_results_top_n",
            "parameters":
            {
                "max_results": 100
            }
        }
    ],
    "message":
    {
        "query_graph":
        {
            "edges":
            {
                "e0":
                {
                    "exclude": false,
                    "predicates":
                    [
                        "biolink:correlated_with",
                        "biolink:associated_with_likelihood_of"
                    ],
                    "subject": "n0",
                    "object": "n1"
                },
                "e1":
                {
                    "exclude": false,
                    "subject": "n1",
                    "object": "n2",
                    "predicates":
                    [
                        "biolink:affects"
                    ],
                    "qualifier_constraints":
                    [
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "activity_or_abundance"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        },
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "expression"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        },
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "secretion"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        },
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "decreased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "degradation"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        }
                    ]
                },
                "e2":
                {
                    "exclude": false,
                    "subject": "n3",
                    "object": "n2",
                    "predicates":
                    [
                        "biolink:affects"
                    ],
                    "qualifier_constraints":
                    [
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "activity_or_abundance"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        },
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "expression"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        },
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "increased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "secretion"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        },
                        {
                            "qualifier_set":
                            [
                                {
                                    "qualifier_type_id": "biolink:object_direction_qualifier",
                                    "qualifier_value": "decreased"
                                },
                                {
                                    "qualifier_type_id": "biolink:object_aspect_qualifier",
                                    "qualifier_value": "degradation"
                                },
                                {
                                    "qualifier_type_id": "biolink:qualified_predicate",
                                    "qualifier_value": "biolink:causes"
                                }
                            ]
                        }
                    ]
                },
                "e4":
                {
                    "exclude": false,
                    "subject": "n2",
                    "object": "n0",
                    "predicates":
                    [
                        "biolink:contributes_to",
                        "biolink:associated_with",
                        "biolink:gene_associated_with_condition"
                    ]
                }
            },
            "nodes":
            {
                "n0":
                {
                    "ids":
                    [
                        "MONDO:0009061"
                    ],
                    "is_set": false
                },
                "n1":
                {
                    "categories":
                    [
                        "biolink:ChemicalEntity"
                    ],
                    "is_set": true
                },
                "n2":
                {
                    "categories":
                    [
                        "biolink:Gene",
                        "biolink:Protein"
                    ],
                    "is_set": true
                },
                "n3":
                {
                    "categories":
                    [
                        "biolink:ChemicalEntity"
                    ],
                    "is_set": false
                }
            }
        }
    }
}

results: https://arax.ncats.io/?r=152776

These queries were submitted via the ARAX UI at https://arax.ncats.io/
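
The same JSON could presumably also be posted to ARAX programmatically rather than through the UI. The sketch below is only an illustration under assumptions: the endpoint URL `https://arax.ncats.io/api/arax/v1.4/query` is assumed and should be checked against the ARAX documentation, and `build_request` and `submit` are hypothetical helper names. It uses only the Python standard library.

```python
import json
import urllib.request

# Assumed TRAPI 1.4 query endpoint; verify against the ARAX docs before relying on it.
ARAX_QUERY_URL = "https://arax.ncats.io/api/arax/v1.4/query"


def build_request(query: dict, url: str = ARAX_QUERY_URL) -> urllib.request.Request:
    """Wrap a TRAPI query dict in a JSON POST request (nothing is sent yet)."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(query: dict, timeout: float = 600.0) -> dict:
    """Send the query and return the decoded JSON response.

    The timeout makes a hung query fail on the client side instead of
    waiting indefinitely.
    """
    with urllib.request.urlopen(build_request(query), timeout=timeout) as response:
        return json.load(response)
```

Calling `submit(query)` with either of the query documents above would perform the actual request; the explicit timeout is a client-side guard only and does not stop the query on the server.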

CaseyTa • Jul 28 '23 20:07

@CaseyTa We're trying to prioritize this issue while keeping the code freeze in mind. How would you rate the importance of resolving this issue? Does this block the clinical ARA from working before the code freeze?

dkoslicki • Aug 02 '23 18:08

For future reference: on /beta this query executes and returns 0 results due to no KPs satisfying e2. On the plain arax.ncats.io, it looks like there is a FET issue:

INFO: Pruning back n1 nodes because there are more than 50
2023-08-02T18:13:53.589507 DEBUG: Using FET to assess quality of intermediate answers in Expand
2023-08-02T18:13:53.589625 DEBUG: Overlaying FET for n0-->n1 (from Expand)
2023-08-02T18:13:53.626672 WARNING: FET produced an error when Expand tried to use it to prune the KG. Log was: Response:
  status: ERROR
  n_errors: 2
  n_warnings: 0
  n_messages: 11
  error_code: UnknownError
  message: Something went wrong with retrieving edges in message KG
  - 2023-08-02T18:13:53.617359 ERROR: [TypeError] Traceback (most recent call last):
      File "/mnt/data/orangeboard/production/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/Overlay/fisher_exact_test.py", line 186, in fisher_exact_test
        edge_attribute_list = [x.value for x in self.message.knowledge_graph.edges[edge_key].attributes if x.attribute_type_id == 'EDAM-DATA:1772']
    TypeError: 'NoneType' object is not iterable
  - 2023-08-02T18:13:53.617419 ERROR: [UnknownError] Something went wrong with retrieving edges in message KG
2023-08-02T18:13:53.626695 DEBUG: Will continue pruning without overlaying FET
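
The TypeError in that traceback comes from iterating over an edge's `attributes` when it is `None`. A minimal sketch of a defensive version of that comprehension follows; the `Attribute` and `Edge` classes are simplified stand-ins for the TRAPI model objects, and `attribute_values` is a hypothetical helper, so this shows the general guard and not the fix that was actually applied in ARAX.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Attribute:
    """Simplified stand-in for a TRAPI edge attribute."""
    attribute_type_id: str
    value: Any


@dataclass
class Edge:
    """Simplified stand-in for a TRAPI edge; attributes may legitimately be None."""
    attributes: Optional[List[Attribute]] = None


def attribute_values(edge: Edge, type_id: str) -> list:
    """Collect values of attributes with the given type, treating None as empty."""
    return [
        attr.value
        for attr in (edge.attributes or [])  # guard: None becomes an empty list
        if attr.attribute_type_id == type_id
    ]


bare_edge = Edge()  # no attributes at all, as in the failing case
typed_edge = Edge(attributes=[
    Attribute("EDAM-DATA:1772", 0.8),
    Attribute("biolink:primary_knowledge_source", "infores:cohd"),
])

print(attribute_values(bare_edge, "EDAM-DATA:1772"))
print(attribute_values(typed_edge, "EDAM-DATA:1772"))
```

With the `or []` guard, an edge that carries no attributes simply contributes no values instead of aborting the overlay.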

dkoslicki • Aug 02 '23 18:08

> @CaseyTa We're trying to prioritize this issue while keeping the code freeze in mind. How would you rate the importance of resolving this issue? Does this block the clinical ARA from working before the code freeze?

@dkoslicki Thanks, my opinion is that this doesn't need to be a high priority before the code freeze. Adding the exclude edge allows us to clean up the results a bit, but I think we can viably continue development and testing without this edge for now.

@karafecho @bill-baumgartner, please chime in also.

CaseyTa • Aug 02 '23 18:08

I agree with Casey that this issue need not be prioritized prior to the code freeze. Our goal is to have the TQS fully tested and deployed to CI (staging) prior to the September relay meeting. As such, this issue can be resolved after the code freeze. Thanks!

karafecho • Aug 02 '23 20:08

Thanks for the clarification, both! We'll put it on the "post September" queue.

dkoslicki • Aug 03 '23 01:08

Revisiting this one, it appears that the query executes properly but returns no results. Increasing the number of intermediate nodes to consider does make it run for a long time, but still with no results for e0. I'll go ahead and close this, as the mysterious 3-hour wait appears to have been resolved. But do let me know if this or related issues arise.

dkoslicki • Jun 19 '24 16:06