adaptive
Blocking behavior of the runner
(original issue on GitLab)
opened by Anton Akhmerov (@anton-akhmerov) at 2017-11-13T14:39:40.396Z
Right now the runner is stopped whenever the user launches anything that blocks the kernel, and that is dangerous in the context of HPC. Say, a single %debug will halt the cluster computations until the user finishes debugging. We should revisit this behavior and think about providing safeguards.
originally posted by Joseph Weston (@jbweston) at 2017-11-13T17:25:12.648Z on GitLab
Getting rid of this restriction is going to be tough.
At the moment the runner and the kernel are able to run in the same thread through the use of cooperative multitasking (i.e. coroutines). This makes it trivial to access the learner from the kernel while the runner is doing its job, because we know that the kernel may only run when the runner is awaiting something, at which time the learner is in a well-defined state.
As you mention above, the disadvantage of cooperative multitasking is that if one coroutine refuses to yield control (a blocking kernel, say) then no other coroutine can make progress (the runner cannot advance). If you want to lift this restriction, then you have to use another mechanism for controlling access to the shared resource (the learner). Experience tells us that this needs to be done carefully.
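To make the starvation concrete, here is a minimal sketch using plain asyncio rather than adaptive's actual runner. The names `runner` and `blocking_kernel` are hypothetical stand-ins: the first yields control on every step, the second holds the thread without ever awaiting.

```python
import asyncio
import time


async def runner(ticks, n=5, interval=0.01):
    # Stand-in for the runner: record a timestamp, then yield via await.
    for _ in range(n):
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_kernel(duration=0.2):
    # Stand-in for a blocking kernel command such as %debug:
    # time.sleep never hands control back to the event loop.
    time.sleep(duration)


async def main():
    ticks = []
    task = asyncio.ensure_future(runner(ticks))
    await asyncio.sleep(0)   # let the runner start and reach its first await
    await blocking_kernel()  # the runner cannot advance during this call
    await task
    return ticks


ticks = asyncio.run(main())
gaps = [later - earlier for earlier, later in zip(ticks, ticks[1:])]
print(f"longest gap between runner steps: {max(gaps):.2f}s")
```

The longest gap is roughly the blocking duration, not the 0.01 s interval the runner asked for, because nothing else can be scheduled until the blocking call returns.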
originally posted by Anton Akhmerov (@anton-akhmerov) at 2017-11-14T08:08:32.017Z on GitLab
I was rather thinking along the lines of reducing the communication channels to the runner by offloading it to a separate process. This would of course restrict our capacity to interact with the learner.
originally posted by Joseph Weston (@jbweston) at 2018-02-19T12:07:53.750Z on GitLab
We now have a BlockingRunner that blocks the kernel. Is this good enough?
originally posted by Bas Nijholt (@basnijholt) at 2018-02-19T12:10:58.076Z on GitLab
I would say yes.
However, you do mention an issue that is still there. I think it's wise to add some more explanation about what happens where: the function being learned is executed in the executor, while the learner methods are called in the same thread as the notebook. So blocking the notebook thread means the learner can't suggest new points to the executor.
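The split described here can be illustrated with a toy example. This is only a sketch: `ToyLearner`, its `ask`/`tell` methods, and the `run` loop are hypothetical stand-ins, not adaptive's real classes. It records which thread runs the learner methods and which thread evaluates the function.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

main_thread = threading.get_ident()


class ToyLearner:
    # Hypothetical minimal learner, not adaptive's real Learner API.
    def __init__(self, points):
        self.pending = list(points)
        self.data = {}
        self.learner_threads = set()

    def ask(self):
        self.learner_threads.add(threading.get_ident())
        return self.pending.pop() if self.pending else None

    def tell(self, x, y):
        self.learner_threads.add(threading.get_ident())
        self.data[x] = y


function_threads = set()


def f(x):
    # The function being learned; evaluated by the executor's worker thread.
    function_threads.add(threading.get_ident())
    return x ** 2


def run(learner, executor):
    # Simplified runner loop: ask for a point, evaluate it remotely, report back.
    while True:
        x = learner.ask()
        if x is None:
            break
        y = executor.submit(f, x).result()
        learner.tell(x, y)


learner = ToyLearner([1, 2, 3])
with ThreadPoolExecutor(max_workers=1) as executor:
    run(learner, executor)

print("learner ran in main thread:", learner.learner_threads == {main_thread})
print("function ran elsewhere:", main_thread not in function_threads)
```

Since `ask` and `tell` only ever run in the calling thread, anything that blocks that thread also stops new points from reaching the executor, even though the executor itself is idle and available.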
originally posted by Joseph Weston (@jbweston) at 2018-02-19T12:34:22.159Z on GitLab
For now we can do something like:
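import asyncio
from functools import partial

def _run(learner, *args, **kwargs):
    BlockingRunner(learner, *args, **kwargs)  # blocks until the run finishes
    return learner

def run_in_background(learner, *args, executor=None, ioloop=None, **kwargs):
    ioloop = ioloop or asyncio.get_event_loop()
    # run_in_executor only forwards positional arguments, so bind everything first
    return ioloop.run_in_executor(executor, partial(_run, learner, *args, **kwargs))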
We won't be able to interact with the learner; we'll only be able to cancel and check/get the result, but this is fine for v0.1
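As a rough illustration of what "check/get the result" looks like from the caller's side, here is a sketch with a hypothetical `blocking_job` standing in for `_run`. It uses only the standard asyncio API and does not involve adaptive at all.

```python
import asyncio
import time


def blocking_job(n):
    # Hypothetical stand-in for _run: blocks its worker thread, then returns.
    time.sleep(0.05)
    return sum(range(n))


async def main():
    loop = asyncio.get_running_loop()
    # None selects the loop's default executor.
    future = loop.run_in_executor(None, blocking_job, 10)
    was_done_immediately = future.done()  # check without blocking
    result = await future                 # get the result once it is ready
    return was_done_immediately, result


was_done_immediately, result = asyncio.run(main())
print(was_done_immediately, result)
```

The returned future is the only handle on the job: the caller can poll it, await it, or cancel it, but has no access to the job's intermediate state.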
originally posted by Joseph Weston (@jbweston) at 2018-02-19T12:41:28.540Z on GitLab
Ah, but this won't quite work.
The executor in which we want to run _run is completely independent from the executor in which we want to run the BlockingRunner.
It is not clear to me how we can get this context into a subprocess without resorting to hacks like passing a string to _run that will then be exec'd (importing the necessary modules and instantiating an executor).