Dependency management between tasks

Open julienlavergne opened this issue 3 years ago • 0 comments

Is your feature request related to a problem? Please describe

Loosely inspired by Ansible's meta file and the upcoming use of graphlib, it would be nice to handle dependencies between tasks. By task, I mean the complete content of a Python file in the tasks folder (assuming we follow the directory structure from the documentation).

If task B depends on task A, then running task B will also run task A first, regardless of whether the user runs task B directly or task B is run through a local.include.

Another useful feature would be the ability to run task B automatically after task A has run (reverse dependency).

A lot of this already works by doing a local.include of the dependent tasks, but that approach does not cover reverse dependencies.
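To make the difference between the two directions concrete, here is a minimal sketch of how both could be resolved into one execution order. The `plan` function and the `triggers` mapping are hypothetical names used only for illustration; nothing like them exists in the project today. It assumes Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter


def plan(requested, dependencies, triggers):
    """Return an execution order for the requested task names.

    dependencies: task -> tasks that must run before it.
    triggers:     task -> tasks that must run after it (reverse dependencies).
    Both mappings are hypothetical inputs for this sketch.
    """
    graph = {}
    pending = list(requested)
    while pending:
        task = pending.pop()
        if task in graph:
            continue
        # Forward dependencies become predecessors of the task.
        graph[task] = set(dependencies.get(task, ()))
        pending.extend(graph[task])
        # Reverse dependencies pull extra tasks into the plan.
        pending.extend(triggers.get(task, ()))
    # A triggered task must come after the task that triggers it.
    for task in list(graph):
        for follower in triggers.get(task, ()):
            graph[follower].add(task)
    return list(TopologicalSorter(graph).static_order())
```

With this sketch, `plan(["b"], {"b": ["a"]}, {})` and `plan(["a"], {}, {"a": ["b"]})` both return `["a", "b"]`: in the first case B pulls A in as a prerequisite, in the second case A pulls B in as a follow-up.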

Describe the solution you'd like

When using local.include, we call exec_file, and I can see that we already get a dict containing the attributes of the script. I propose that we reserve two names: dependencies and run (or whatever suits best).

If run is present, the task is added to the dependency graph with dependencies as its predecessors. We then need a final function to run all the tasks in the graph.
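As a rough illustration of the graph and the final function described above, the following sketch uses graphlib.TopologicalSorter (Python 3.9+). The names `TASKS`, `register_task` and `run_all` are assumptions made for this example, not existing API.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: task name -> (run callable, dependency names).
TASKS = {}


def register_task(name, run, dependencies=()):
    """Record a task that defines run, with its dependencies as predecessors."""
    TASKS[name] = (run, tuple(dependencies))


def run_all():
    """Run every registered task once, dependencies first; return the order."""
    graph = {name: deps for name, (_, deps) in TASKS.items()}
    # static_order() raises graphlib.CycleError on circular dependencies.
    order = list(TopologicalSorter(graph).static_order())
    missing = [name for name in order if name not in TASKS]
    if missing:
        raise KeyError(f"unregistered dependencies: {missing}")
    for name in order:
        TASKS[name][0]()
    return order
```

Registering B with `dependencies=["a"]` before registering A still yields the order `["a", "b"]`, since the order comes from the graph rather than from the registration sequence.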

The advantage is that there is no impact on existing tasks: they continue to be executed as soon as local.include is called. Users who want to use the dependency mechanism can place their code in the run function and optionally define the dependencies attribute in the script.

I tested the following code in util.exec_file and it does the job, but I did not handle the construction of the dependency graph.

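# Execute the task file; its top-level names end up in `data`.
exec(PYTHON_CODES[filename], data)

# Run each declared dependency first (no graph yet, so no de-duplication).
if "dependencies" in data:
    for d in data["dependencies"]:
        exec_file(f"tasks/{d}.py")

# Then run the task's own entry point, if it defines one.
if "run" in data:
    data["run"]()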
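One possible way to cover the missing graph construction is to collect the `dependencies` and `run` attributes first and only execute afterwards, so a shared dependency runs once and cycles are detected. This is a self-contained sketch, not a patch for util.exec_file: `SOURCES`, `LOG` and `collect` are hypothetical stand-ins, with `SOURCES` replacing the files under tasks/.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory stand-in for the files under tasks/.
SOURCES = {
    "a": "def run():\n    LOG.append('a')\n",
    "b": "dependencies = ['a']\n\ndef run():\n    LOG.append('b')\n",
}
LOG = []  # records the order in which run() functions are called


def collect(name, graph, runners):
    """Load a task and its dependencies without running them."""
    if name in graph:
        return  # already loaded; also stops infinite recursion on cycles
    data = {"LOG": LOG}
    exec(SOURCES[name], data)
    deps = list(data.get("dependencies", ()))
    graph[name] = deps
    if "run" in data:
        runners[name] = data["run"]
    for dep in deps:
        collect(dep, graph, runners)


graph, runners = {}, {}
collect("b", graph, runners)

# Final step: run everything once, predecessors first.
for name in TopologicalSorter(graph).static_order():
    if name in runners:
        runners[name]()
```

After this runs, `LOG` is `['a', 'b']`. Tasks without a run function are still loaded (their top-level code executes as before) but contribute nothing to the final pass.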

May 31 '22 07:05 julienlavergne