Feedback issue: inf NGD edge as evidence for treats?
The following issue was logged in the Translator Feedback repo: https://github.com/NCATSTranslator/Feedback/issues/987
I'm not entirely certain what the objection here is, but if one looks at the referenced result: https://arax.ncats.io/?r=45666c50-06bf-48ba-82b5-2bfae1b764d7
The third result is:
The support graph is:
which might be okay in some circumstances. But in this case, the value is inf:
If the NGD is infinite, should there even be an edge there at all? I'm thinking no.
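For context, the standard NGD formula (my gloss, to make the "inf" case concrete) is, with $f(x)$, $f(y)$ the occurrence counts of the two terms in the corpus, $f(x,y)$ their co-occurrence count, and $N$ the corpus size:

$$
\mathrm{NGD}(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}{\log N - \min\{\log f(x), \log f(y)\}}
$$

When $f(x, y) = 0$, the numerator is infinite: the two terms never co-occur anywhere in the corpus, so the edge carries no literature signal at all.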
I think the main complaint from the Feedback issue is that the primary knowledge source is ARAX itself:
which seems to be frowned upon, although it might be okay if ARAX is the entity computing the NGDs.
Can someone on the team more familiar with Expand and NGD assess this situation and respond to this and the original issue?
ARAX itself is computing the NGDs, so I don't know what else we would put there as the primary_knowledge_source. That is an interesting thought about not including such edges when the NGD is inf; currently we just down-weight them in the ranker.
@chunyuma do you know how easy or difficult it would be to just not include edges with NGD=inf?
I also realized this issue, and this is what I plan to do in the ranker update. It is not difficult, and I have added the code to exclude those edges. But I need to verify whether excluding the ranker results with NGD = inf affects too much, as I realized that the target results of some test suites contain NGD = inf. If we excluded all results with NGD = inf, some target results would be excluded as well. So, should we proceed with this idea, @dkoslicki @edeutsch?
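Roughly, the exclusion looks like this (a simplified sketch; the attribute name and layout below are placeholders, not the exact ranker code):

```python
import math

def drop_inf_ngd_edges(edges: dict) -> dict:
    """Drop edges whose NGD value is infinite (no literature co-occurrence).

    Assumes TRAPI-style edges keyed by edge id, whose 'attributes' list may
    carry an NGD entry; the attribute name below is an assumption."""
    kept = {}
    for key, edge in edges.items():
        ngds = [a.get("value") for a in edge.get("attributes") or []
                if a.get("original_attribute_name") == "normalized_google_distance"]
        if any(isinstance(v, float) and math.isinf(v) for v in ngds):
            continue  # NGD == inf: skip this edge entirely
        kept[key] = edge
    return kept
```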
I think checking the suggested change versus the old way of doing it via the testing suite, like you propose, makes sense to me, i.e., try it and see whether it badly affects the test performance.
I suppose one thing I wonder about is: where did the treats edge even come from? Presumably it came from xDTD? And xDTD's only evidence is an inf NGD relationship? Maybe this relationship just should not be in xDTD?
So xDTD doesn't consider NGD edges: its predictions are based on local geometric neighborhoods and other embedding information (such as bioBERT embeddings of name, category, etc.). For drug repurposing, I don't think it's a bad thing to consider drugs that have not been mentioned in the literature for that particular disease yet. The NGD edges are just added afterwards, like they are in every response.
ah, of course. At the risk of revealing more of my ignorance, may I ask a few more questions?
So I had naively understood that the analysis support graph was *the* explanation for why xDTD was offering the particular treats triple. But that is probably not really the case? Perhaps the real truth about why xDTD offers a particular treats triple is because "the model says so with a confidence above X". Which seems reasonable/obvious for an ML algorithm; one usually can't know exactly why the answer is what it is. But then I've backed myself into a corner: what then exactly *is* the provided support graph?
Consider this result, which is a bit more illustrative: https://arax.ncats.io/?r=328722
The top answer "siltuximab treats Castleman Disease" has a pleasant analysis support graph:
But what is that really? Presumably not *the* reason the xDTD model offers that treats triple, because we can't really know that ("the model just says so"). But rather, what is displayed is just the training data that is relevant here? Basically what KG2 has to say about this relationship? Is that right?
And then the lowest answer in that result set is "lomustine treats Castleman disease". It is there "because the model said so with a probability above X". But since KG2 has nothing about that relationship, there are no concrete edges displayed; the only edge is the NGD edge. Which in this case is also inf because there is nothing in the literature. And of course it gets a terrible result ranking score because the only edge in the support graph is the NGD edge with inf distance.
So the lesson for me then is: xDTD support graphs are not "why the model predicted a treats triple", but rather, the true "why" is unknown, and the support graph just shows what KG2 knows about the two nodes plus an NGD edge. Which might just be NGD=inf if there is nothing in the literature?
Have I (finally) understood this correctly? or not really?
It is then interesting to ponder whether Pathfinder can help here. Because if you ask the new Pathfinder how lomustine and Castleman Disease are related, it will sing about the hundred ways: https://arax.ncats.io/?r=328724
Might it be fun to replace the current KG2-based support graph for xDTD with, say, a merger of KG2 and the "best" 5 analyses from pathfinder between the two concepts or something?
Thank you for reading this far.
Hi @edeutsch, thanks for your comments and interesting ideas.
> So I had naively understood that the analysis support graph was *the* explanation for why xDTD was offering the particular treats triple. But that is probably not really the case? Perhaps the real truth about why xDTD offers a particular treats triple is because "the model says so with a confidence above X". Which seems reasonable/obvious for an ML algorithm; one usually can't know exactly why the answer is what it is. But then I've backed myself into a corner: what then exactly *is* the provided support graph?
>
> But what is that really? Presumably not *the* reason the xDTD model offers that treats triple, because we can't really know that ("the model just says so"). But rather, what is displayed is just the training data that is relevant here? Basically what KG2 has to say about this relationship? Is that right?
First, I don't think the analysis support graph is *the* explanation for the xDTD prediction, but the edge support graphs are expected to be. The analysis support graph seems to always be the NGD edge provided by ARAX. xDTD has two modules: one (the prediction module) is mainly used to predict a probability score for how likely a given drug treats a disease; the other (the MOA module) is mainly used to predict and extract biologically meaningful and plausible paths for the explanation, although it doesn't really explain why the model makes such a prediction (because it is a kind of post-hoc explanation).
Different ML/DL models use different mechanisms for predictions and explanations. xDTD mainly depends on KG2's topological structure and utilizes the neighborhood nodes and message-passing mechanism of a GNN for prediction. Therefore, its predictions might not necessarily have support from the existing literature (because they might be new findings), so it is reasonable to have an analysis support graph with NGD = inf. However, the edge support graphs from the MOA module of xDTD provide a sort of potential explanation.
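To sketch the two-module split in simplified pseudocode (this is an illustration, not the actual xDTD code):

```python
# Simplified pseudocode of the two xDTD modules (illustrative stubs only).

def predict_treats_probability(drug_id: str, disease_id: str) -> float:
    """Prediction module: a GNN over KG2 (neighborhood structure plus node
    embeddings such as bioBERT features) scores P(drug treats disease)."""
    raise NotImplementedError  # stands in for the trained model

def extract_moa_paths(drug_id: str, disease_id: str, k: int = 10) -> list:
    """MOA module: post-hoc extraction of up to k biologically plausible
    KG2 paths between the pair, used as the edge support graph."""
    raise NotImplementedError  # stands in for the trained path extractor

# The key point: the second call does not feed into the first, so a high
# probability can coexist with an empty or weak support graph.
```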
> And then the lowest answer in that result set is "lomustine treats Castleman disease". It is there "because the model said so with a probability above X". But since KG2 has nothing about that relationship, there are no concrete edges displayed; the only edge is the NGD edge. Which in this case is also inf because there is nothing in the literature. And of course it gets a terrible result ranking score because the only edge in the support graph is the NGD edge with inf distance.
Currently, the ranker doesn't use the NGD score from the analysis support graph for ranking, unless the NGD scores appear in the edge attributes. The reason "lomustine treats Castleman disease" was the lowest answer is not that its NGD is inf and no concrete edges are displayed. Its rank is low for several reasons: the number of edges in the result is small (only one prediction from xDTD), the xDTD prediction score is not high (only 0.7219), and there are no concrete edges from very reliable sources (e.g., DrugBank, DrugCentral).
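As a toy illustration of how such factors could combine (the weights below are made up for illustration; the real ranker differs):

```python
# Assumed reliability weights for illustration only.
SOURCE_RELIABILITY = {"infores:drugbank": 1.0, "infores:drugcentral": 1.0,
                      "infores:arax": 0.3}

def toy_result_score(n_edges: int, xdtd_probability: float, sources: list) -> float:
    """Combine the three factors named above: edge count, xDTD prediction
    score, and reliability of the supporting knowledge sources."""
    source_factor = max((SOURCE_RELIABILITY.get(s, 0.2) for s in sources), default=0.2)
    edge_factor = min(1.0, n_edges / 5)  # few edges -> weak support
    return xdtd_probability * source_factor * edge_factor

# "lomustine treats Castleman disease": one edge, score 0.7219, ARAX-only source
print(toy_result_score(1, 0.7219, ["infores:arax"]))  # small -> ranked last
```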
> So the lesson for me then is: xDTD support graphs are not "why the model predicted a treats triple", but rather, the true "why" is unknown, and the support graph just shows what KG2 knows about the two nodes plus an NGD edge. Which might just be NGD=inf if there is nothing in the literature?
As I mentioned above, the xDTD support graphs just offer a sort of potential explanation for the prediction. Of course, some predictions are new and lack substantial evidence. However, this is why we call it "inferred"/"creative" mode: it is still unknown (just inferred).
> Might it be fun to replace the current KG2-based support graph for xDTD with, say, a merger of KG2 and the "best" 5 analyses from pathfinder between the two concepts or something?
I think the xDTD KG2-based support graphs and the paths from Pathfinder are two different kinds of results. The xDTD KG2-based support graphs are model-predicted concrete paths in KG2, while the Pathfinder paths are concrete KG2 paths selected according to human-specified criteria.
Hope my answers make this clearer.
It seems like excluding results with NGD == inf will affect the ranker's performance on the test suites, although the influence is minor.
Results with NGD == inf included:
Results with NGD == inf excluded:
@dkoslicki @edeutsch, any idea whether we should exclude NGD == inf?
Yes, go ahead @chunyuma and exclude xDTD edges with only NGD == inf. In cases with multiple evidence types, just remove the edge with NGD == inf.
The justification was that we need to emphasize user experience: an NGD edge implies something exists in the literature, yet with NGD == inf we weren't actually able to calculate any relatedness.
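In other words, something like the following rule (TRAPI-ish sketch; the attribute name is a stand-in for whatever the NGD edges actually carry):

```python
import math

def apply_inf_ngd_rule(edges: dict) -> dict:
    """Edges whose only evidence is an infinite NGD get dropped; edges with
    additional evidence just lose the infinite-NGD attribute."""
    kept = {}
    for key, edge in edges.items():
        attrs = edge.get("attributes") or []
        is_inf_ngd = [a.get("original_attribute_name") == "normalized_google_distance"
                      and isinstance(a.get("value"), float) and math.isinf(a["value"])
                      for a in attrs]
        if not any(is_inf_ngd):
            kept[key] = edge                       # untouched
        elif not all(is_inf_ngd):                  # other evidence exists
            edge["attributes"] = [a for a, bad in zip(attrs, is_inf_ngd) if not bad]
            kept[key] = edge
        # else: infinite NGD was the only evidence -> drop the edge
    return kept
```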
See the result 4 analysis support graph of https://arax.ncats.io/?r=329486, specifically the HHV-8... edge.
Looks like the inf NGD edges in the analysis support graphs are a red herring. Instead, the main issue was the missing edge support graphs. @chunyuma will do a check in the xDTD pre-compute database to see how many "treats" edges are missing any edge support graph, the goal being to see how long it would take to use Pathfinder on such examples in order to ensure that every edge has a support graph (be it from the xDTD RL model or the Pathfinder approach).
Hi @dkoslicki @edeutsch,
I just had a check in the pre-compute database: around 60% of "treats" edges are missing any edge support graph. So, if we need to run the Pathfinder approach for them, we should first estimate how long the pre-computation would take.
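The check amounts to something like this (the table and column names below are illustrative; the real pre-compute database layout differs):

```python
import sqlite3

# Hypothetical schema: a 'treats_edges' table and a 'support_paths' table
# keyed by (drug_id, disease_id).
con = sqlite3.connect("xdtd_precompute.sqlite")  # assumed filename
total = con.execute("SELECT COUNT(*) FROM treats_edges").fetchone()[0]
missing = con.execute("""
    SELECT COUNT(*) FROM treats_edges t
    WHERE NOT EXISTS (
        SELECT 1 FROM support_paths s
        WHERE s.drug_id = t.drug_id AND s.disease_id = t.disease_id
    )
""").fetchone()[0]
print(f"{missing}/{total} treats edges ({100 * missing / total:.0f}%) lack a support graph")
```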
@mohsenht, I know Pathfinder takes about 12 seconds when you invoke it from the UI, but if you initiate a Pathfinder call locally (e.g., load a Python module, then just invoke it on the CLI), do you know how long it takes?
Good news, Chunyu: at least it's embarrassingly parallel and doesn't require GPU compute. We can even explore a local PloverDB so we don't need to take the latency hit of reaching out over the WWW.
Hi @dkoslicki,
Pathfinder has two steps:
- Find paths without edges: it returns only paths of nodes. Takes about 10 seconds.
- Find all edges between nodes that are neighbors: this depends on how many paths we want in the output. As we discussed today in the meeting, if it is just a handful of paths, it takes 1~2 seconds.

Probably it takes around 10~12 seconds for each call if we call it locally.
I haven't tried using a local PloverDB; however, you are right that PloverDB calls are the bottleneck of Pathfinder.
Update: I ran 30 Pathfinder requests for the first step, and on average each takes 10 seconds.
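I timed it along these lines, with `run_pathfinder` standing in for the actual Pathfinder entry point:

```python
import statistics
import time

def time_pathfinder(pairs, run_pathfinder):
    """Time one Pathfinder call per (drug, disease) pair and report the mean."""
    durations = []
    for drug_id, disease_id in pairs:
        start = time.perf_counter()
        run_pathfinder(drug_id, disease_id)
        durations.append(time.perf_counter() - start)
    print(f"mean {statistics.mean(durations):.1f} s over {len(durations)} calls")
```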
Thanks @dkoslicki @mohsenht.
Great! Then we can probably try running Pathfinder during the next round of xDTD model training. @mohsenht, if possible, could you kindly put together a sample script showing how to run Pathfinder locally given a drug-disease pair? Thank you!
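For concreteness, here is my guess at what such a script might look like; the import path, the TRAPI path-query shape, and the response fields are all guesses that need checking against the actual code:

```python
# Hypothetical sketch: run Pathfinder in-process for one drug-disease pair.
from ARAX_query import ARAXQuery  # assumed module inside the RTX repo

drug_curie = "PUBCHEM.COMPOUND:XXXX"   # placeholder drug identifier
disease_curie = "MONDO:XXXXXXX"        # placeholder disease identifier

query = {"message": {"query_graph": {
    "nodes": {
        "n0": {"ids": [drug_curie]},
        "n1": {"ids": [disease_curie]},
    },
    # TRAPI 1.5-style path query connecting the two pinned nodes
    "paths": {
        "p0": {"subject": "n0", "object": "n1"},
    },
}}}

araxq = ARAXQuery()
response = araxq.query(query)  # assumed entry point and signature
print(response.status)         # assumed response attribute
```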
Hi @edeutsch ,
@chunyuma plans to use Pathfinder during xDTD model training and will likely need to dynamically select a limited number of paths for his training data. As you know, Pathfinder's TRAPI JSON query doesn't currently support this parameter. I’ve added it to the Pathfinder module so it can be accessed via ARAXi (DSL), but unfortunately, DSL isn’t working with Pathfinder at the moment.
Here’s the issue detailing the problem with DSL not working for Pathfinder.
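Once that's fixed, the intended usage would look roughly like this (`connect` is an existing ARAXi action, but the Pathfinder wiring and the `max_paths` parameter name here are placeholders, not necessarily what was added):

```python
# Illustrative ARAXi (DSL) invocation with a hypothetical path-count limit.
actions = [
    "create_message",
    "add_qnode(ids=[PUBCHEM.COMPOUND:XXXX], key=n0)",  # placeholder drug
    "add_qnode(ids=[MONDO:XXXXXXX], key=n1)",          # placeholder disease
    "connect(action=connect_nodes, max_paths=5)",      # hypothetical parameter
    "resultify()",
    "return(message=true, store=false)",
]
```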
Thanks, everyone.
My uninformed $0.02: I feel that if the NGD score for a virtual edge is "infinity", then per my understanding of NGD this means there is basically no semantic relatedness between the subject and object concepts, and thus there shouldn't be such an NGD virtual edge connecting those concept nodes in the graph.
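To make that concrete, here is the arithmetic (per the formula quoted earlier in the thread):

```python
import math

def ngd(fx: int, fy: int, fxy: int, n: int) -> float:
    """Normalized Google Distance from occurrence counts fx, fy,
    co-occurrence count fxy, and corpus size n."""
    if fxy == 0:
        return math.inf  # log(0) -> infinite numerator: no co-occurrence at all
    num = max(math.log(fx), math.log(fy)) - math.log(fxy)
    den = math.log(n) - min(math.log(fx), math.log(fy))
    return num / den

print(ngd(5000, 300, 12, 10_000_000))  # finite: some shared literature
print(ngd(5000, 300, 0, 10_000_000))   # inf: terms never co-occur
```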
I'm not sure if anyone is suggesting this, but I'd be hesitant to exclude a "creative mode" predicted (i.e., xDTD-predicted) drug-disease result based on the absence of an interesting NGD connection. My understanding of "creative mode" is that one of its main purposes is to recommend interesting new drug-disease pairs that presumably wouldn't have literature co-occurrence because they are new (and thus, presumably, unstudied together).