
Replace SubDAGs with TaskGroups (bug 1866141)

Open sean-rose opened this issue 8 months ago • 2 comments

Bug 1866141: Replace Airflow SubDAGs with TaskGroups

SubDAGs were deprecated in Airflow v2.0, and it appears they'll be removed entirely in v3.0. TaskGroups are Airflow's recommended alternative to SubDAGs, so this replaces all usages of SubDAGs with TaskGroups.

cc @mikaeld

sean-rose, Nov 29 '23 00:11

> Have you tested any of the DAGs that use moz_dataproc_pyspark_runner?

No, I haven't tested those, for three reasons. First, setting up local development for Dataproc tasks is kind of daunting given the multiple buckets and permissions configuration needed (we could use Dataproc equivalents of make gke and make clean-gke). Second, there appears to be a bug in part of the Dataproc dev logic. Third, the two active DAGs that are fully set up for local Dataproc development (using utils.dataproc.copy_artifacts_dev) both take longer than an hour to run.

@mikaeld would it be reasonable to deploy this branch to the Airflow dev environment for testing? Do you know if a valid google_cloud_airflow_dataproc connection is set up in that environment?

sean-rose, Nov 29 '23 19:11

@sean-rose create an RC pre-release (like this), e.g. 2.7.3.1rc, and you can then trigger a dev deployment using that tag.

mikaeld, Dec 04 '23 19:12