[docs] - Guide for creating custom run launchers
Summary
Add a guide for creating a custom run launcher to the Open Source deployment docs. This section is currently all that exists on this topic.
Conversation
This issue was generated from the slack conversation at: https://dagster.slack.com/archives/C01U954MEER/p1636907764186200?thread_ts=1636907764.186200&cid=C01U954MEER
Conversation excerpt:
U02M5L5KCBX: Hi guys, I am curious whether there is any documentation, blog post, or code example that details how the run launcher and executor communicate with the rest of the system.
I tried to create a run launcher for Azure Container Instance but found out I am missing the basic knowledge about all these details :slightly_smiling_face:. I went through the documentation several times but didn't find anything that could really help me. I also tried to read the DockerRunLauncher code but could not figure out how it works.
U016C4E5CP8: Hi Pavel - there's a diagram with the high level setup here: https://docs.dagster.io/deployment/overview#architecture but unfortunately there aren't great docs about writing your own run launcher yet. To start I don't think you need to worry about the executor piece, at least until you have a run launcher up and running - the default executor should work out of the box to start.
Ultimately all that the run launcher needs to do is launch a process in your execution environment of choice that runs the command "dagster api execute_run" with the right arguments (which are usually the same for all run launchers - the only thing that differs is the mechanics for how the process is spun up, e.g. in Docker/K8s/Azure/etc.).
What the DockerRunLauncher does is create a container with that command (link), then run that command by calling container.start().
U016C4E5CP8: <@U018K0G2Y85> docs better docs for creating your own Run Launcher
U02M5L5KCBX: Hi Daniel, thanks a lot for your answer! I was able to create the Azure Container Instance following your suggested path. It is still a little bit confusing for me. I will try to describe it as I see it now :slightly_smiling_face:
The run launcher creates the instance and starts the executor which uses the same dagster.yaml for reading/writing to these storages and that's actually the way how it communicates with the rest of the system. It reads the pipeline definition from there and writes progress and other metadata info there. That means the storages have to be remotely accessible in the same way on these different nodes. Is that correct?
One thing I don't understand is why an executor needs run_launcher as well. In my case, there are settings that I would prefer not to pass to those nodes. Maybe I am missing something else why this is needed there.
U016C4E5CP8: That's exactly right - one small detail - what you're calling an 'executor' is typically called a 'run worker' in the docs: https://docs.dagster.io/deployment/overview#job-execution-flow - it then runs an executor which is a python class. But that's just a detail. Your last question is a great point and something we want to improve - https://github.com/dagster-io/dagster/issues/5674
U016C4E5CP8: in that end state, the run worker nodes would still need to be able to reach the storages, but wouldn't need to be able to load the run launcher
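To make the flow described in the conversation concrete, here is a minimal, dependency-free sketch of what a run launcher boils down to: build the `dagster api execute_run` command and hand it to whatever starts processes in your environment. It deliberately does not use Dagster's real `RunLauncher` base class. `SketchRunLauncher`, `build_run_worker_command`, `start_process`, and `fake_start` are hypothetical names invented for illustration, and the "right arguments" mentioned above are treated as an opaque, already-serialized string because the thread does not spell out their format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def build_run_worker_command(serialized_args: str) -> List[str]:
    """Build the command the run worker process should execute.

    Assumption: the arguments arrive as one opaque, pre-serialized string.
    """
    return ["dagster", "api", "execute_run", serialized_args]


@dataclass
class SketchRunLauncher:
    """Hypothetical stand-in for a run launcher, not Dagster's actual class."""

    # Environment-specific hook: starts a process (container, pod, ACI group,
    # ...) that runs the given command and returns a handle identifying it.
    start_process: Callable[[List[str]], str]
    # Maps run IDs to process handles so a run can be looked up later.
    handles: Dict[str, str] = field(default_factory=dict)

    def launch_run(self, run_id: str, serialized_args: str) -> str:
        command = build_run_worker_command(serialized_args)
        handle = self.start_process(command)
        self.handles[run_id] = handle
        return handle


# Fake backend for demonstration: records commands instead of starting anything.
started: List[List[str]] = []


def fake_start(command: List[str]) -> str:
    started.append(command)
    return f"fake-handle-{len(started)}"


launcher = SketchRunLauncher(start_process=fake_start)
handle = launcher.launch_run("run-1", '{"hypothetical": "args"}')
print(handle, started[0][:3])
```

In this sketch only `start_process` would change between Docker, Kubernetes, or Azure Container Instances, which matches the point made in the thread that the command is usually the same and only the mechanics of spinning up the process differ. As the later messages note, the started process also needs to be able to reach the storages configured in `dagster.yaml`.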
Message from the maintainers:
Are you looking for the same documentation content? Give it a :thumbsup:. We factor engagement into prioritization.