public-datasets-pipelines
Consider utilizing GKE Operators in anticipation of shifting to Composer v2
Composer v2 was released this week, and it uses GKE Autopilot under the hood. With GKE Autopilot, we can no longer use node pool affinity with the KubernetesPodOperator. In anticipation of eventually migrating to Composer v2, we can use the GKEStartPodOperator, a subclass of the KubernetesPodOperator, in one of two ways:
- Create a long-lasting GKE cluster with node pools that is managed outside of Airflow, and use the GKEStartPodOperator with node pool affinity as needed.
- Programmatically create an ephemeral GKE cluster with node pools using the GKECreateClusterOperator, use the GKEStartPodOperator to launch pods with node pool affinity, and tear down the cluster with the GKEDeleteClusterOperator.
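A rough sketch of the second option, to make the discussion concrete. The project ID, location, cluster name, node pool name, machine type, container image, and DAG ID are all hypothetical placeholders, and the helper functions `cluster_body`, `node_pool_affinity`, and `build_dag` are illustrative names, not part of this repo. It assumes Airflow 2.x with the `apache-airflow-providers-google` package installed; argument names may differ between provider versions. For the first option, only the `GKEStartPodOperator` task would be needed, pointed at the existing cluster.

```python
from datetime import datetime

# Hypothetical placeholders; none of these values come from this issue.
PROJECT_ID = "my-project"
LOCATION = "us-central1-c"
CLUSTER_NAME = "ephemeral-pipeline-cluster"
NODE_POOL = "pool-e2-standard-4"


def cluster_body(name, pool, machine_type="e2-standard-4"):
    """Build a minimal GKE cluster definition with a single named node pool."""
    return {
        "name": name,
        "node_pools": [
            {
                "name": pool,
                "initial_node_count": 1,
                "config": {"machine_type": machine_type},
            }
        ],
    }


def node_pool_affinity(pool):
    """Build a node affinity dict that pins a pod to the given GKE node pool."""
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                # Label that GKE sets on every node in a pool.
                                "key": "cloud.google.com/gke-nodepool",
                                "operator": "In",
                                "values": [pool],
                            }
                        ]
                    }
                ]
            }
        }
    }


def build_dag():
    """Assemble the create -> run -> delete DAG (requires Airflow and the Google provider)."""
    from airflow import DAG
    from airflow.providers.google.cloud.operators.kubernetes_engine import (
        GKECreateClusterOperator,
        GKEDeleteClusterOperator,
        GKEStartPodOperator,
    )

    with DAG(
        dag_id="gke_ephemeral_cluster_example",
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,  # Airflow 2.x argument name (assumption)
        catchup=False,
    ) as dag:
        create_cluster = GKECreateClusterOperator(
            task_id="create_cluster",
            project_id=PROJECT_ID,
            location=LOCATION,
            body=cluster_body(CLUSTER_NAME, NODE_POOL),
        )
        run_pod = GKEStartPodOperator(
            task_id="run_pod",
            project_id=PROJECT_ID,
            location=LOCATION,
            cluster_name=CLUSTER_NAME,
            namespace="default",
            name="example-pod",
            image="python:3.8-slim",
            cmds=["python", "-c", "print('hello from GKE')"],
            affinity=node_pool_affinity(NODE_POOL),
        )
        delete_cluster = GKEDeleteClusterOperator(
            task_id="delete_cluster",
            project_id=PROJECT_ID,
            location=LOCATION,
            name=CLUSTER_NAME,
            trigger_rule="all_done",  # tear down even if the pod task fails
        )
        create_cluster >> run_pod >> delete_cluster
    return dag
```

In an actual DAG file, `dag = build_dag()` would be assigned at module level so the scheduler can discover it. Setting `trigger_rule="all_done"` on the delete task is one way to avoid leaving an ephemeral cluster running after a failed pod.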
This came up in the discussion on #180.