astro-sdk
Analyze methods for running a DAG locally
Please describe the feature you'd like to see
Since the SQL CLI will be a new way for SQL engineers to develop Airflow DAGs on their local machines, we should take a broader look at how we can optimize the local DAG-writing experience. The current `airflow dags test` command is incredibly slow and requires running an entire scheduler on the user's machine, creating confusing scheduler-level logs that do not help the user debug.
Describe the solution you'd like
We want to create a local DAG-writing solution that is both faster and surfaces only the information relevant to the user: their task logs and task statuses. This solution does not necessarily need every scheduler feature (e.g., parallelization, retries, etc.). Part of this analysis would be determining which of those features are necessary.
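To illustrate how small that surface area is, a local runner conceptually only needs dependency-ordered, sequential execution with per-task status reporting. The sketch below is a hypothetical illustration (not any real Airflow or SDK API) using plain callables and Python's standard-library `graphlib`:

```python
from graphlib import TopologicalSorter

# Minimal sketch of what a lightweight local DAG runner needs to do,
# assuming tasks are plain callables and dependencies are given as a
# dict of task -> set of upstream tasks. No scheduler, no retries, no
# parallelism: run each task once, in dependency order, and report
# per-task status -- the subset of scheduler behavior that actually
# matters for local debugging.
def run_dag_locally(tasks, deps):
    statuses = {}
    for name in TopologicalSorter(deps).static_order():
        try:
            tasks[name]()
            statuses[name] = "success"
        except Exception as exc:
            statuses[name] = f"failed: {exc}"
            break  # stop at the first failure, like a sequential run
    return statuses

tasks = {
    "extract": lambda: None,
    "transform": lambda: None,
    "load": lambda: None,
}
deps = {"transform": {"extract"}, "load": {"transform"}}
print(run_dag_locally(tasks, deps))
# {'extract': 'success', 'transform': 'success', 'load': 'success'}
```

Everything a real scheduler adds on top of this (heartbeats, backfill bookkeeping, executor queues) is exactly the noise the issue is trying to remove from the local loop.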
Are there any alternatives to this feature?
That is what we would like to find out.
Additional context
None.
Acceptance Criteria
- [x] A decision document that outlines potential improvements to the local airflow dev experience
- [ ] An AEP similar to our table cleanup proposal with possible ways to run a DAG locally, their pros and cons, and recommendations for future development
What is missing:
- [ ] add AEP
Follow-up tickets:
- [ ] backport Airflow (Astronomer runtime or CLI itself - TBD)
@dimberman to add a TL;DR
tl;dr of part 1: How we've improved DAG running
We've added https://github.com/apache/airflow/pull/26400 to the open-source Airflow project; it will be released in Airflow 2.5. This feature felt like a critical piece of debugging architecture that we wanted to offer to the entire Airflow community.
This new DAG runner no longer requires running a backfill job with a scheduler heartbeat. It also offers much more concise error logging and IDE integrations.
tl;dr of part 2: Steps for how we're going to offer local DAG development to Astronomer customers
After discussing with @jedcunningham and @sunkickr, I think we've come to the following conclusions:
- Backports to earlier Airflow versions are purely for critical bugfixes and shouldn't add extra functionality. We'd be opening a can of worms by making an exception here, so backporting https://github.com/apache/airflow/pull/26400 is out of the question.
- Adding an extra file to the runtime images is also not ideal and would make runtime building more complex.
With these in mind, here is how I think we will proceed:
In the short term, I'm going to make a runnable script for @sunkickr to unblock him so he can start testing user flows with this local DAG-running capability. We will use a Docker image as the backend to avoid the complexities of different user system setups. This can be revisited later.
In the medium term, I'm going to add the `dag.test` functionality to the Astro Python SDK. Users on Airflow <2.5 can access this functionality by installing the Python SDK and running `astro.run_dag(dag)`. We will also add more features here than in the OSS version, since we can iterate much faster and don't need to worry about backporting.
The main blocker to using the `astro.run_dag(dag)` method is that we need to merge https://github.com/astronomer/astro-sdk/pull/970 and release it. XCOM_PICKLING is a non-starter for a number of users, so we can't have it as a requirement if we are going to make the Astro Python SDK a default download for our runtime images.
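For reference, XCom pickling is controlled by a core Airflow config flag (equivalently, the `AIRFLOW__CORE__ENABLE_XCOM_PICKLING` environment variable). It defaults to off, and many deployments keep it off for security reasons, which is why it can't be a prerequisite:

```ini
# airflow.cfg -- pickle-based XCom serialization; disabled by default
# because unpickling untrusted data can execute arbitrary code.
[core]
enable_xcom_pickling = True
```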
OK, after speaking with @tatiana further, I think we have an even better solution.
We're going to include the dag running script in the SQL CLI project initially.
Since the SQL CLI is not yet 1.0, we have much more flexibility in what we include. This also means we don't need the DAG testing code as part of Astro SDK 1.2. So the plan is as follows:
- The `astro run` command will be added to the SQL CLI.
- Astro Runtime images will import the SQL CLI and Astro SDK >=1.2.
- We will add the DAG testing code to astro-python-sdk in version 1.3.
- From version 1.3+, users will be able to run the DAG testing code in their IDEs.
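The IDE workflow that last bullet enables can be sketched roughly as follows. This is a hedged illustration, not the actual SDK API: `run_dag` is a stand-in name borrowed from the plan above and stubbed out here, so that the file shape (a DAG module that is directly runnable) is the only thing being demonstrated:

```python
# A DAG file that can be executed directly, so an IDE "Run"/"Debug"
# button runs the tasks locally with breakpoints working as usual.
def run_dag(task_fns):
    # Stand-in for the SDK's local runner: execute tasks sequentially
    # and report status per task.
    for fn in task_fns:
        fn()
        print(f"{fn.__name__}: success")

def extract():
    return [1, 2, 3]

def load():
    pass

if __name__ == "__main__":
    # Running `python this_file.py` (or the IDE's run button) tests the
    # DAG locally -- no scheduler, no webserver.
    run_dag([extract, load])
```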