Does it make sense to treat "lookup" edges from a KP as "treats" edges in creative mode?
One of the failing automated tests contains an edge asserting “COVID-19 vaccine treats multiple sclerosis”, which appears to come from “clinical trials creative expand”.
The screenshot shows that the edge is MESH:D000086663 biolink:treats MONDO:0005301 (see figure 1). However, the "support_graphs" edge attribute seems to indicate that this edge might come from the infores:multiomics-clinicaltrials KP. After checking with Gwênlyn, the original predicate of this edge is biolink:in_clinical_trials_for. Although it is one of the descendants of biolink:treats_or_applied_or_studied_to_treat, it doesn't make sense to consider it a "treats" edge unless elevate_to_prediction is True.
Should we keep the biolink:in_clinical_trials_for predicate as is and still show the result (even though the query graph was a creative biolink:treats query)?
Copying from the Slack discussion:
CTKP has MESH:D000086663 biolink:in_clinical_trials_for MONDO:0005301, based on NCT05286242. There is no 'treats' edge.
There was plenty of discussion on how some such assertions could be 'elevated' to a treats prediction downstream from CTKP. To support this, I added a field elevate_to_prediction ... which happens to be False for this edge.
Some trials have just one intervention that is unique to experimental arms, in which case I assume that is indeed what is being tested. Those end up labeled elevate_to_prediction = True. If there's more than one such intervention, it's less clear what exactly is being tested, and lacking that confidence, elevate_to_prediction is False. Furthermore, the phase of the trial has to be higher than PHASE1 and lower than PHASE4. (If it is PHASE4, it gets its separate biolink:treats assertion anyway, so no need to predict.)
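For illustration, the rule described above could be sketched roughly like this. This is not CTKP's actual code: the Arm class, the PHASE_ORDER table and the function signature are hypothetical, and combined phase labels (for example a trial registered under two phases at once) are simply not handled.

```python
from dataclasses import dataclass, field

# Assumed ordering of phase labels (hypothetical table, not taken from CTKP).
PHASE_ORDER = {"PHASE1": 1, "PHASE2": 2, "PHASE3": 3, "PHASE4": 4}


@dataclass
class Arm:
    """Hypothetical, simplified view of one trial arm."""
    is_experimental: bool
    interventions: set = field(default_factory=set)


def elevate_to_prediction(arms, phase):
    """Return True only when the heuristic from the discussion is satisfied."""
    experimental = set()
    other = set()
    for arm in arms:
        target = experimental if arm.is_experimental else other
        target.update(arm.interventions)

    # Interventions that appear in experimental arms and nowhere else.
    unique_to_experimental = experimental - other

    rank = PHASE_ORDER.get(phase)
    if rank is None:
        return False  # unknown or missing phase: stay conservative

    # Strictly higher than PHASE1 and strictly lower than PHASE4.
    phase_ok = PHASE_ORDER["PHASE1"] < rank < PHASE_ORDER["PHASE4"]
    return len(unique_to_experimental) == 1 and phase_ok
```

With one experimental arm holding {"drug_a", "placebo"} and one comparator arm holding {"placebo"}, only "drug_a" is unique to the experimental side, so a PHASE2 trial would be flagged while the same trial at PHASE1 or PHASE4 would not.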
It is important, though, to distinguish between a treats assertion and a treats prediction. Upgrading a weaker assertion to a treats assertion is fraught. Upgrading to a prediction is fine, if there's good rationale for doing so.
An alternative might be to change the lookup-type result's predicate to treats, but also tack on EPC metadata to explain that this is an ARA prediction based on a simple heuristic applied to a lookup-type result. Maybe we are already doing this, and I am just uneducated on the matter.
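To make that alternative concrete, here is a minimal sketch of wrapping a lookup edge in a predicted treats edge with EPC metadata attached. The function name and the aux graph ID scheme are made up for illustration, and this is not ARAX's implementation; the attribute type IDs are the ones visible on the edge in this thread.

```python
def elevate_to_treats_prediction(lookup_edge_id, lookup_edge, ara_infores="infores:arax"):
    """Build a predicted biolink:treats edge that points back at the lookup edge.

    Returns (predicted_edge, aux_graph_id, aux_graph). The lookup edge itself
    is left untouched so the original KP predicate stays inspectable.
    """
    # Hypothetical ID scheme for the auxiliary (support) graph.
    aux_graph_id = f"aux_graph_{lookup_edge_id}"
    aux_graph = {"edges": [lookup_edge_id], "attributes": []}

    def attr(type_id, value):
        return {
            "attribute_type_id": type_id,
            "value": value,
            "attribute_source": ara_infores,
        }

    predicted_edge = {
        "subject": lookup_edge["subject"],
        "predicate": "biolink:treats",
        "object": lookup_edge["object"],
        "attributes": [
            # EPC metadata marking this as an automated prediction.
            attr("biolink:agent_type", "automated_agent"),
            attr("biolink:knowledge_level", "prediction"),
            attr("biolink:support_graphs", [aux_graph_id]),
        ],
        "sources": [
            {"resource_id": ara_infores, "resource_role": "primary_knowledge_source"}
        ],
    }
    return predicted_edge, aux_graph_id, aux_graph
```

Keeping the original edge in a support graph, rather than rewriting its predicate in place, means the weaker KP assertion is never presented as a treats assertion.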
Predicted edges should be clearly labeled with knowledge_level = prediction, and any downstream users should pay attention to that.
Does anyone have a query graph that reproduces this issue? Or a test case ID?
OK, I see now that there was knowledge_level: prediction from the get-go.
I'd nevertheless also like to see what this result looks like in the Translator UI.
OK, if you run this QG through ARAX via the ARAX web browser interface on arax.ci.transltr.io:
{
  "edges": {
    "t_edge": {
      "attribute_constraints": [],
      "knowledge_type": "inferred",
      "object": "on",
      "predicates": [
        "biolink:treats"
      ],
      "qualifier_constraints": [],
      "subject": "sn"
    }
  },
  "nodes": {
    "on": {
      "categories": [
        "biolink:Disease"
      ],
      "constraints": [],
      "ids": [
        "MONDO:0005301"
      ],
      "is_set": false
    },
    "sn": {
      "categories": [
        "biolink:ChemicalEntity"
      ],
      "constraints": [],
      "is_set": false
    }
  }
}
then on the results page, https://arax.ci.transltr.io/?r=358475, look at result 240 and you'll see this issue.
Here is the result graph:
"creative_expand_treats_edge:MESH:D000086663--treats--MONDO:0005301--infores:arax": {
  "attributes": [
    {
      "attribute_source": "infores:arax",
      "attribute_type_id": "biolink:agent_type",
      "attributes": null,
      "description": null,
      "original_attribute_name": null,
      "value": "automated_agent",
      "value_type_id": null,
      "value_url": null
    },
    {
      "attribute_source": "infores:arax",
      "attribute_type_id": "biolink:knowledge_level",
      "attributes": null,
      "description": null,
      "original_attribute_name": null,
      "value": "prediction",
      "value_type_id": null,
      "value_url": null
    },
    {
      "attribute_source": "infores:arax",
      "attribute_type_id": "biolink:support_graphs",
      "attributes": null,
      "description": null,
      "original_attribute_name": null,
      "value": [
        "aux_graph_infores:multiomics-clinicaltrials:MESH:D000086663--biolink:in_clinical_trials_for--None--None--None--MONDO:0005301--infores:multiomics-clinicaltrials_creative_expand_treats_group_t_edge"
      ],
      "value_type_id": null,
      "value_url": null
    }
  ],
  "object": "MONDO:0005301",
  "predicate": "biolink:treats",
  "qualifiers": null,
  "sources": [
    {
      "resource_id": "infores:arax",
      "resource_role": "primary_knowledge_source",
      "source_record_urls": null,
      "upstream_resource_ids": null
    }
  ],
  "subject": "MESH:D000086663"
},
and the support graph:
{
  "analyses": [
    {
      "attributes": null,
      "edge_bindings": {
        "t_edge": [
          {
            "attributes": [],
            "id": "creative_expand_treats_edge:MESH:D000086663--treats--MONDO:0005301--infores:arax"
          }
        ]
      },
      "resource_id": "infores:arax",
      "score": 0.688,
      "scoring_method": null,
      "support_graphs": [
        "aux_graph_N1_471"
      ]
    }
  ],
  "confidence": null,
  "description": "No description available",
  "essence": "COVID-19 Vaccines",
  "essence_category": "['biolink:Drug', 'biolink:SmallMolecule']",
  "id": null,
  "node_bindings": {
    "on": [
      {
        "attributes": [],
        "id": "MONDO:0005301",
        "query_id": null
      }
    ],
    "sn": [
      {
        "attributes": [],
        "id": "MESH:D000086663",
        "query_id": null
      }
    ]
  },
  "resource_id": "infores:arax",
  "result_group": null,
  "result_group_similarity_score": null,
  "row_data": [
    0.688,
    "COVID-19 Vaccines",
    "['biolink:Drug', 'biolink:SmallMolecule']"
  ],
  "score": null,
  "score_direction": null,
  "score_name": null
},
and the aux graph info:
"aux_graph_infores:multiomics-clinicaltrials:MESH:D000086663--biolink:in_clinical_trials_for--None--None--None--MONDO:0005301--infores:multiomics-clinicaltrials_creative_expand_treats_group_t_edge": {
  "attributes": [],
  "edges": [
    "infores:multiomics-clinicaltrials:MESH:D000086663--biolink:in_clinical_trials_for--None--None--None--MONDO:0005301--infores:multiomics-clinicaltrials"
  ]
},
I'm not wild about the fact that in this example, the user (apparently? I am not an expert) has to inspect the support graph to see that this prediction is based on elevation of an in_clinical_trials_for predicate in a triple from a KP.
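The manual inspection described here could in principle be automated. A rough sketch, assuming the message has already been parsed into plain dicts shaped like the excerpts above (knowledge graph edges keyed by edge ID, auxiliary graphs keyed by aux graph ID); supporting_predicates is a hypothetical helper name, not an existing ARAX function:

```python
def supporting_predicates(edge, knowledge_graph_edges, auxiliary_graphs):
    """Collect the predicates of every edge referenced by the edge's support graphs."""
    predicates = set()
    for attribute in edge.get("attributes") or []:
        if attribute.get("attribute_type_id") != "biolink:support_graphs":
            continue
        for aux_id in attribute.get("value") or []:
            aux_graph = auxiliary_graphs.get(aux_id, {})
            for edge_id in aux_graph.get("edges", []):
                supporting = knowledge_graph_edges.get(edge_id)
                if supporting is not None:
                    predicates.add(supporting.get("predicate"))
    return predicates
```

For the edge in this thread, a helper like this would surface biolink:in_clinical_trials_for next to the predicted biolink:treats edge, so a UI could show the underlying predicate without the user opening the support graph.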
One possible ARAX GUI feature: if two nodes are connected only by biolink:treats edges with knowledge_level: prediction, the title bar could be given a more cautionary color. Perhaps blue for a lookup answer and orange for a prediction?
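As a sketch of the decision logic only (the real GUI is not written this way, and the function and constant names here are invented for illustration), the color choice could look like:

```python
LOOKUP_COLOR = "blue"        # assumed color for lookup answers
PREDICTION_COLOR = "orange"  # assumed cautionary color for predictions


def knowledge_level(edge):
    """Return the biolink:knowledge_level value of an edge, or None if absent."""
    for attribute in edge.get("attributes") or []:
        if attribute.get("attribute_type_id") == "biolink:knowledge_level":
            return attribute.get("value")
    return None


def title_bar_color(edges_between_pair):
    """Pick the cautionary color only if every connecting edge is a predicted treats edge."""
    only_predicted_treats = bool(edges_between_pair) and all(
        edge.get("predicate") == "biolink:treats"
        and knowledge_level(edge) == "prediction"
        for edge in edges_between_pair
    )
    return PREDICTION_COLOR if only_predicted_treats else LOOKUP_COLOR
```

A single non-predicted edge between the pair is enough to fall back to the lookup color, which matches the "only connected by" wording of the proposal.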