execnet icon indicating copy to clipboard operation
execnet copied to clipboard

multi: add timeout to reply.get

Open JCourt1 opened this issue 2 years ago • 3 comments

It is possible for this call to hang, like the other calls in this function. Passing through the timeout removes that possibility.

This was first noted on #43

JCourt1 avatar Sep 18 '23 17:09 JCourt1

@JCourt1 do you have a example of the hang? the code in question should pass in all cases for threading, is there a different execmodel involved

(the killfunc has a own timeout that should apply)

RonnyPfannschmidt avatar Sep 19 '23 12:09 RonnyPfannschmidt

should pass in all cases for threading

Yes I agree with you in principle, I was confused by this. I can say though that empirically I am hitting the issue... Unfortunately I'm not able to reproduce this in a minimal example outside of the codebase I am encountering it in. It's through pytest-xdist, and I think that does just use "thread" as the execmodel. I just hit it again actually:

  File ".../venv/lib/python3.11/site-packages/xdist/dsession.py", line 87, in pytest_sessionfinish
    nm.teardown_nodes()
  File ".../venv/lib/python3.11/site-packages/xdist/workermanage.py", line 81, in teardown_nodes
    self.group.terminate(self.EXIT_TIMEOUT)
  File ".../venv/lib/python3.11/site-packages/execnet/multi.py", line 214, in terminate
    safe_terminate(
  File ".../venv/lib/python3.11/site-packages/execnet/multi.py", line 310, in safe_terminate
    reply.get()
  File ".../venv/lib/python3.11/site-packages/execnet/gateway_base.py", line 282, in get
    self.waitfinish(timeout)
  File ".../venv/lib/python3.11/site-packages/execnet/gateway_base.py", line 289, in waitfinish
    if not self._result_ready.wait(timeout):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.pyenv/versions/3.11.5/lib/python3.11/threading.py", line 622, in wait
    signaled = self._cond.wait(timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../.pyenv/versions/3.11.5/lib/python3.11/threading.py", line 320, in wait
    waiter.acquire()
KeyboardInterrupt

re. this:

This looks a bit like each action is getting 2 spawns (one outer,one inner)

The invocation nesting Looks incorrect

I get the impression that it is that way just to allow the termkills to be run in parallel. But do we even need the for loop over replyList? Could this be deleted:

    for reply in replylist:
        reply.get()

JCourt1 avatar Sep 19 '23 16:09 JCourt1

I believe the intent was to complete all Tasks from the pool

Its unclear what's is needed for the other backends

RonnyPfannschmidt avatar Sep 19 '23 16:09 RonnyPfannschmidt