RTX icon indicating copy to clipboard operation
RTX copied to clipboard

Unusual pause in Expansion?

Open edeutsch opened this issue 4 years ago • 3 comments

Marc Duby reported an issue where the traffic light table was showing some long KP query times: image

You'll notice that all KPs including RTX-KG2 took at least 60 seconds.

BUT, the weird thing is here: image

Although I'm not completely certain, I think the corresponding KG2 query is this: image

The first odd thing is that it is only 4 seconds. but the second is that it is only 30 results instead of 14692.

It may be that I'm the one who's confused. https://arax.ncats.io/?r=30835

{
  "edges": {
    "e00": {
      "subject": "n00",
      "object": "n01"
    },
    "e01": {
      "subject": "n01",
      "object": "n02"
    }
  },
  "nodes": {
    "n00": {
      "ids": [
        "DOID:14330"
      ],
      "is_set": false,
      "categories": [
        "biolink:Disease"
      ],
      "name": "DOID:14330"
    },
    "n01": {
      "is_set": false,
      "categories": [
        "biolink:NamedThing"
      ]
    },
    "n02": {
      "ids": [
        "DOID:8778"
      ],
      "is_set": false,
      "categories": [
        "biolink:Disease"
      ],
      "name": "DOID:8778"
    }
  }
}

Hmm, so I just ran it again and the same thing happened with the above query. image

ARAX didn't send the query for some 80 seconds. And then after sending it, it only took 5 seconds and returned 30 results. but the traffic light says: image

ah, I suppose there's a distinction between results and edges, but still, that's a pretty big different.

Something weird is going on, but I'm not quite sure what.

edeutsch avatar Oct 26 '21 20:10 edeutsch

the number of results is usually tiny when ARAX queries KG2 because ARAX hacks the system by setting is_set=true on most qnodes since it doesn't care about the results (only the knowledge graph), and that makes processing much faster. so this is expected.

and as for the 76 seconds - I think in this case DTD's time was effectively added to KG2's (and the other KPs') times, because DTD doesn't contain any non-blocking code (so as soon as the system starts working on DTD, it won't go back to any other KP threads until DTD is done, which took 65 seconds). it's a bummer that this means the other KPs' times aren't accurate, but not sure of a way around that..

amykglen avatar Oct 26 '21 20:10 amykglen

ah okay, that makes sense, thanks. And it might either do DTD before or after KG2? That would be consistent with something else: in the first example above, KG2 launched very quickly but still claimed to take a long time, while in the second case, it didn't launch for over 60 seconds.

great, thanks, I think I understand.

It seems like the way to solve this is to do the DTD work in a separate thread with something like this:

        dtd_thread = threading.Thread(target=self.do_dtd_stuff(), args=(query,mode,))
        # the extra thread wanders off into self.do_dtd_stuff() but then afterwards ends up back here and keeps going
        dtd_thread.start()

        # Need a way to let the main thread get in here to do more stuff, but prevent the extra thread from also wandering
        # in here some time later when it's done with self.do_dtd_stuff(). both threads share all memory.
	if you're not the dtd thread:
		do all the asyncio stuff

        dtd_thread.join()

The part that I always find tricky is the "if you're not the dtd thread" part. After the extra thread completes it continues at dtd_thread.start() and you have to make sure that it doesn't also do the "do all the asyncio stuff". I think I usually resort to some brittle state setting in self that lets the main thread enter the "if" quickly, but then prevents the extra thread from getting in there. The extra thread maybe needs to have an extra delay to make sure that the main thread safely gets into its "if" before the extra thread gets there. I'm not very good at this stuff. But if you can solve that little problem, everything else seems to work well and easily.

edeutsch avatar Oct 26 '21 21:10 edeutsch

@edeutsch @amykglen is this still relevant or good to close?

finnagin avatar Feb 02 '22 06:02 finnagin

sufficiently stale that I think this is good to close.

edeutsch avatar Jun 26 '24 06:06 edeutsch