Almost cycle causes standup error
In the standup query: https://arax.ncats.io/?source=ARS&id=cdccafa8-3ec1-4c61-a2fe-c4f72f55c5d1 For the JSON:
{
  "edges": {
    "e0": {
      "object": "n1",
      "predicates": [
        "biolink:related_to"
      ],
      "subject": "n0"
    },
    "e1": {
      "object": "n1",
      "predicates": [
        "biolink:related_to"
      ],
      "subject": "n2"
    },
    "e2": {
      "object": "n0",
      "predicates": [
        "biolink:related_to"
      ],
      "subject": "n2"
    }
  },
  "nodes": {
    "n0": {
      "categories": [
        "biolink:Disease"
      ],
      "ids": [
        "MONDO:0010161"
      ]
    },
    "n1": {
      "categories": [
        "biolink:Gene"
      ]
    },
    "n2": {
      "categories": [
        "biolink:ChemicalSubstance"
      ]
    }
  }
}
The QG looks like:
n0->n1<-n2
^        |
|--------|
It throws the error: 2021-07-12T21:54:32.681037 ERROR: [InteralError_F260] Reached loop max: 21
I don’t know if this is a query graph interpreter issue (@edeutsch) or an expand error (@amykglen).
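For anyone trying to reproduce this, here is a minimal sketch of what "almost cycle" means for the query graph above: no directed loop exists, but the three edges do close a loop once direction is ignored. The `EDGES` dict and the `has_undirected_cycle` helper are hypothetical names for illustration only, not anything from the ARAX code base.

```python
# Edges copied from the query graph above as (subject, object) pairs.
EDGES = {"e0": ("n0", "n1"), "e1": ("n2", "n1"), "e2": ("n2", "n0")}


def has_undirected_cycle(edges):
    """Return True if the edges close a loop once direction is ignored (union-find)."""
    parent = {}

    def find(node):
        # Walk up to the representative of this node's connected component.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for subject, obj in edges.values():
        root_s, root_o = find(subject), find(obj)
        if root_s == root_o:
            return True  # both ends are already connected, so this edge closes a loop
        parent[root_s] = root_o
    return False


print(has_undirected_cycle(EDGES))
```

Any code that walks the QG without regard to edge direction would see this loop, which may be relevant to the "Reached loop max" error, though that is only a guess.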
The mystery deepens: running the whole thing in DSL results in resultify complaining about edge e2 (Mondo to chemical substance): https://arax.ncats.io/?r=15954
However, querying just with this edge does find things: https://arax.ncats.io/?r=15956
So perhaps there are no paths/KGs that can satisfy this query (though ARAX shouldn’t be throwing an error).
hmm, yeah, so the error with the JSON query indeed seems to be a query graph interpreter issue.
but in the case of the DSL query (which avoids running into the QGI issue), I think the (almost) cycle might be confusing expand (specifically its pruning function)... I'll investigate.
And @edeutsch, it looks like the error is here. GitHub is telling me you wrote that to address a different issue (#1248), but perhaps with these sorts of queries we would want to enforce edge directions, so as to circumvent cycles. As an aside, are true cycles fundamentally disallowed in our system? I can't immediately see why they would be, other than that 99.9% of queries are not cycles, so we didn't really code for them.
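To illustrate the "enforce edge directions" idea, here is a rough sketch using Kahn's topological sort on the same three edges. It only shows that the QG becomes linearly orderable when direction is respected; `directed_order` is a hypothetical helper, not how the query graph interpreter or expand actually works.

```python
from collections import deque

# Edges copied from the query graph above as (subject, object) pairs.
EDGES = {"e0": ("n0", "n1"), "e1": ("n2", "n1"), "e2": ("n2", "n0")}


def directed_order(edges):
    """Return a topological order of the nodes, or None if a true directed cycle exists."""
    successors = {}
    in_degree = {}
    for subject, obj in edges.values():
        successors.setdefault(subject, []).append(obj)
        successors.setdefault(obj, [])
        in_degree[obj] = in_degree.get(obj, 0) + 1
        in_degree.setdefault(subject, 0)

    # Start from nodes with no incoming edges (sorted for a deterministic order).
    ready = deque(sorted(n for n, d in in_degree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)
    # If some nodes were never released, they sit on a directed cycle.
    return order if len(order) == len(in_degree) else None


print(directed_order(EDGES))
```

For this QG the order comes out as n2, n0, n1, so following directions alone would begin at the unpinned ChemicalSubstance node rather than at the MONDO node.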
The reason is mostly that it is useful to know where the starting point of a query_graph is and where the ending point is, so that the display can properly identify and show the "answer nodes", and so that the query graph interpreter has a "starting point" for matching templates. It is probably not fundamentally necessary, but since 99.9% of queries are not cycles, it seemed like a reasonable shortcut to make that assumption. This could/should be fixed, but I am on vacation this week and don't have the time to think about how to address this. I wouldn't be able to get to it until the end of the month.
I see; no worries @edeutsch, we can figure it out later (e.g., CURIE nodes as a "starting point"). Enjoy the vacation!
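A rough sketch of the "CURIE nodes as a starting point" idea, assuming a pinned node is simply one with a non-empty `ids` list in the query graph. `pinned_nodes` is a hypothetical helper for illustration, not an existing ARAX function.

```python
# Nodes copied from the query graph above.
NODES = {
    "n0": {"categories": ["biolink:Disease"], "ids": ["MONDO:0010161"]},
    "n1": {"categories": ["biolink:Gene"]},
    "n2": {"categories": ["biolink:ChemicalSubstance"]},
}


def pinned_nodes(nodes):
    """Return the keys of query nodes that carry at least one CURIE in 'ids'."""
    return [key for key, node in nodes.items() if node.get("ids")]


print(pinned_nodes(NODES))
```

Here only n0 qualifies, so it would be the single candidate start; a QG with several pinned nodes (or none) would still need some tie-breaking rule.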
ok, I fixed the expand portion of the problem - the equivalent DSL query is working for me now in master... (it does find results).
so I think when the issues with QG interpretation are eventually fixed we should be good to go here!
Great, thanks @amykglen! I've been fiddling with the QGI, but can't get it to do anything except linear queries so far.
This works in prod now, and appears to respect the QGI (and actually gives reasonable results from what I could Google). Closing.