Nested transformation improvements
seamless.direct.transformer wraps a function inside a DirectTransformer object, which launches direct transformations (seamless.direct.Transformation) when called. In addition, direct transformations can also be created from unbound high-level Transformer objects (Transformer.get_transformation). Nested transformation is when direct transformations are created inside an existing transformation.
There are two kinds of nested transformation: local and non-local (delegated). By default, DirectTransformer objects have local=None, meaning that delegated nested transformation is tried first. Local nested transformation is then used as a fallback.
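The fallback order described above can be sketched in plain Python. This is a minimal illustration, not Seamless code: `run_nested`, `DelegationUnavailable`, and the two callables are hypothetical names, and the meaning given to `local=True` and `local=False` is an assumption (the text only specifies `local=None`).

```python
from typing import Callable, Optional


class DelegationUnavailable(Exception):
    """Hypothetical error: no assistant could take the job."""


def run_nested(
    run_delegated: Callable[[], object],
    run_local: Callable[[], object],
    local: Optional[bool] = None,
) -> object:
    """Dispatch a nested transformation according to the `local` setting."""
    if local is True:
        # Assumption: local=True forces local execution.
        return run_local()
    if local is False:
        # Assumption: local=False forces delegation, with no fallback.
        return run_delegated()
    # local=None: try delegation first, then fall back to local execution.
    try:
        return run_delegated()
    except DelegationUnavailable:
        return run_local()


def no_assistant() -> object:
    """Stand-in for a delegated call when no assistant is reachable."""
    raise DelegationUnavailable("no assistant reachable")


# With the default local=None, the failed delegation falls back to local.
result = run_nested(no_assistant, lambda: "computed locally")
```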
Local nested transformation already works. After the transformation has been launched in a forked seamless.core.execute.execute call, the fork changes the behavior of any subsequent call involving the seamless.direct.run machinery. Namely, seamless.direct.run will now forward local nested transformation calls to the parent process via a parent process queue. Some improvement may be needed: currently, all calls are queued up until any one of them is waited for, so all calls are launched and waited for only at that point.
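The queuing behavior that may need improvement can be illustrated with a toy model. This sketch does not use Seamless or real processes; `DeferredCallQueue` and its methods are hypothetical names that only mimic the "nothing launches until something is waited for" pattern described above.

```python
from typing import Any, Callable, Dict, List, Tuple


class DeferredCallQueue:
    """Toy model: calls are only recorded until one of them is waited for."""

    def __init__(self) -> None:
        self._pending: List[Tuple[int, Callable[[], Any]]] = []
        self._results: Dict[int, Any] = {}
        self._next_id = 0
        # Order in which calls were actually launched.
        self.launch_log: List[int] = []

    def submit(self, func: Callable[[], Any]) -> int:
        """Record a call and return a handle; nothing is launched yet."""
        handle = self._next_id
        self._next_id += 1
        self._pending.append((handle, func))
        return handle

    def wait(self, handle: int) -> Any:
        """Waiting for any handle first launches every queued call."""
        pending, self._pending = self._pending, []
        for queued_handle, func in pending:
            self.launch_log.append(queued_handle)
            self._results[queued_handle] = func()
        return self._results[handle]


queue = DeferredCallQueue()
a = queue.submit(lambda: 1 + 1)
b = queue.submit(lambda: 2 * 3)
launched_before_wait = list(queue.launch_log)  # still empty at this point
value_b = queue.wait(b)  # waiting for b also launches a
```

An improved design would presumably start each call as soon as it is submitted, so that independent nested transformations overlap instead of starting together at the first wait.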
Non-local nested transformation means that an assistant must be available inside the transformer. This doesn't work for any of the current assistants (micro, mini or mini-dask). This will be a bit complicated in cases where the assistant lives on a user machine whereas the job is executed on a cluster. Barring some kind of reverse tunneling or websockets, one solution for Dask-based execution is to make an "in-process assistant" as a thin wrapper around the Dask scheduler (which is by necessity available for each worker). Add a "release lock"/"acquire lock" API to the assistant protocol. For the Dask in-process assistant, these will be simple wrappers around Client.secede() and Client.rejoin().
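A minimal sketch of such a "release lock"/"acquire lock" pair follows. The class name `InProcessLockAPI` and its method names are hypothetical and are not taken from the Seamless code base. One assumption to flag: in Dask's distributed package, `secede` and `rejoin` are exposed as module-level functions (`distributed.secede()`, `distributed.rejoin()`) that must be called from within a task running on a worker, so the sketch wraps those rather than Client methods. The callables can be injected, which allows the class to be exercised without a Dask cluster.

```python
from typing import Callable, Optional


class InProcessLockAPI:
    """Hypothetical release/acquire lock pair for an in-process assistant."""

    def __init__(
        self,
        secede: Optional[Callable[[], None]] = None,
        rejoin: Optional[Callable[[], None]] = None,
    ) -> None:
        if secede is None or rejoin is None:
            # Imported lazily so the class can be tested without Dask installed.
            import distributed

            secede = secede or distributed.secede
            rejoin = rejoin or distributed.rejoin
        self._secede = secede
        self._rejoin = rejoin
        self._released = False

    def release_lock(self) -> None:
        """Give up the worker thread slot while waiting on a nested job."""
        if not self._released:
            self._secede()
            self._released = True

    def acquire_lock(self) -> None:
        """Take a thread slot again before resuming the outer transformation."""
        if self._released:
            self._rejoin()
            self._released = False


# Exercise the wrapper with stand-in callables instead of a real Dask worker.
events = []
api = InProcessLockAPI(
    secede=lambda: events.append("secede"),
    rejoin=lambda: events.append("rejoin"),
)
api.release_lock()
api.acquire_lock()
```

The `_released` flag makes both calls idempotent, so a transformer that waits on several nested jobs in a row does not secede twice.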
There is now an InProcessAssistant class.
Instead of communicating with the Dask scheduler, a worker could also try to reach the Dask client inside the original assistant. In that case, use the same Dask mechanism as https://github.com/sjdv1982/seamless/issues/219, and probably store an ID that identifies the original assistant (since multiple assistants can connect to the scheduler).
See also #241
Non-local nested transformation now works for a "local" mini assistant (devel), by setting the Seamless assistant IP address to "localhost" inside the mini-assistant-devel Docker image.
(the meaning of "local" is getting a bit confused here!)