Use base aws classes in amazon provider Operators/Sensors/Triggers

Open Taragolis opened this issue 2 years ago • 10 comments

Body

Right now there are inconsistencies between the different Operators/Sensors/Triggers which use the AwsBaseGeneric hook. After #34784 and #34884 we could make all of them (except Transfer Operators) use the same subset of generic attributes.

This might make these objects easier for end users to work with (no need to wonder whether the argument is region or region_name, or whether it is supported at all), and it would also make it easier to extend all of these operators with new hook parameters in one go.

General approach for each case:

  • Use AwsBaseOperator for operators and AwsBaseSensor for sensors. In some very specific cases, e.g. when multiple inheritance is required, use AwsBaseHookMixin (see: AppflowRecordsShortCircuitOperator)
  • Provide additional arguments to AwsBaseWaiterTrigger (if required)
  • Wrap template_fields parameters
  • Attach the generic documentation to the service-specific page in the Amazon Provider documentation
  • Validate that all parameters are propagated to the operator/sensor-specific hook
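
The steps above can be sketched roughly as follows. This is a simplified, standard-library-only model of the pattern, not the provider's code: `BaseAwsOperatorSketch`, `FakeHook`, `with_generic_fields` and `ExampleQueryOperator` are hypothetical stand-ins invented for illustration, and the set of generic arguments (`aws_conn_id`, `region_name`, `verify`, `botocore_config`) is an assumption about what the base classes expose.

```python
from __future__ import annotations

from functools import cached_property
from typing import Any

# Assumed generic fields that every AWS operator/sensor should template.
GENERIC_TEMPLATE_FIELDS = {"aws_conn_id", "region_name", "verify"}


def with_generic_fields(*fields: str) -> tuple[str, ...]:
    """Hypothetical helper: merge service-specific template fields with the generic ones."""
    return tuple(sorted(GENERIC_TEMPLATE_FIELDS | set(fields)))


class FakeHook:
    """Hypothetical hook that only records the generic parameters it receives."""

    def __init__(self, aws_conn_id=None, region_name=None, verify=None, config=None):
        self.aws_conn_id = aws_conn_id
        self.region_name = region_name
        self.verify = verify
        self.config = config


class BaseAwsOperatorSketch:
    """Stand-in for a base class that owns the generic arguments in one place."""

    aws_hook_class: type = FakeHook
    template_fields: tuple[str, ...] = with_generic_fields()

    def __init__(self, *, aws_conn_id="aws_default", region_name=None,
                 verify=None, botocore_config=None):
        self.aws_conn_id = aws_conn_id
        self.region_name = region_name
        self.verify = verify
        self.botocore_config = botocore_config

    @property
    def _hook_parameters(self) -> dict[str, Any]:
        # Single place where the generic arguments are propagated to the hook.
        return {
            "aws_conn_id": self.aws_conn_id,
            "region_name": self.region_name,
            "verify": self.verify,
            "config": self.botocore_config,
        }

    @cached_property
    def hook(self):
        return self.aws_hook_class(**self._hook_parameters)


class ExampleQueryOperator(BaseAwsOperatorSketch):
    """Service-specific operator: declares only its own arguments."""

    template_fields = with_generic_fields("query")

    def __init__(self, *, query: str, **kwargs):
        super().__init__(**kwargs)
        self.query = query
```

With this shape, a subclass never redeclares `region_name` or `aws_conn_id`, and checking that parameters reach the hook reduces to asserting on `op.hook` in a unit test.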

All of these steps are already done for some of the operators, so those can be used as a reference:

  • [x] Use base aws classes in Amazon AppFlow Operators (https://github.com/apache/airflow/pull/35082)
  • [x] Use base aws classes in Amazon Athena Operators/Sensors/Triggers (https://github.com/apache/airflow/pull/35133)
  • [ ] Use base aws classes in AWS Batch Operators/Sensors/Triggers (https://github.com/apache/airflow/pull/35226)
  • [x] Use base aws classes in AWS CloudFormation Operators/Sensors https://github.com/apache/airflow/pull/36771
  • [x] Use base aws classes in AWS Datasync Operators https://github.com/apache/airflow/pull/36766
  • [x] Use base aws classes in AWS DMS Operators/Sensors https://github.com/apache/airflow/pull/36772
  • [x] Use base aws classes in Amazon DynamoDB Sensors https://github.com/apache/airflow/pull/36770
  • [x] Use base aws classes in Amazon EC2 Operators/Sensors/Triggers https://github.com/apache/airflow/pull/47931
  • [x] Use base aws classes in Amazon ECS Operators/Sensors/Triggers (https://github.com/apache/airflow/pull/36393)
  • [x] Use base aws classes in Amazon EKS Operators/Sensors/Triggers https://github.com/apache/airflow/pull/48192
  • [x] Use base aws classes in Amazon EMR Operators/Sensors/Triggers https://github.com/apache/airflow/pull/49486
  • [x] Use base aws classes in Amazon EventBridge Operators https://github.com/apache/airflow/pull/36765
  • [ ] Use base aws classes in AWS Glue Operators/Sensors/Triggers
  • [x] Use base aws classes in AWS Glue Data Catalog Sensors https://github.com/apache/airflow/pull/40492
  • [x] Use base aws classes in AWS Glue Crawlers Operators/Sensors/Triggers https://github.com/apache/airflow/pull/40504
  • [x] Use base aws classes in AWS Glue DataBrew Operators/Triggers https://github.com/apache/airflow/pull/41848
  • [x] Use base aws classes in AWS Lambda Operators/Sensors/Triggers (https://github.com/apache/airflow/pull/34890)
  • [x] Use base aws classes in Amazon QuickSight Operators/Sensors https://github.com/apache/airflow/pull/36776
  • [x] Use base aws classes in Amazon RDS Operators/Sensors/Triggers https://github.com/apache/airflow/pull/48872
  • [ ] Use base aws classes in Amazon Redshift clusters Operators/Sensors/Triggers
  • [x] Use base aws classes in AWS Redshift Data API Operators https://github.com/apache/airflow/pull/36764
  • [x] Use base aws classes in Amazon S3 Operators/Sensors/Triggers https://github.com/apache/airflow/pull/47321
  • [x] Use base aws classes in Amazon S3 Glacier Operators/Sensors (https://github.com/apache/airflow/pull/35108)
  • [ ] Use base aws classes in Amazon SageMaker Operators/Sensors/Triggers
  • [x] Use base aws classes in Amazon SNS Operators https://github.com/apache/airflow/pull/36615
  • [x] Use base aws classes in Amazon SQS Operators/Sensors/Triggers https://github.com/apache/airflow/pull/36613
  • [x] Use base aws classes in AWS Step Functions Operators/Sensors/Triggers https://github.com/apache/airflow/pull/36468

Committer

  • [X] I acknowledge that I am a maintainer/committer of the Apache Airflow project.

Taragolis avatar Oct 30 '23 18:10 Taragolis

Thanks for getting this set up!

ferruzzi avatar Nov 22 '23 20:11 ferruzzi

@Taragolis I am happy to look into "Use base aws classes in AWS Glue Crawlers Operators/Sensors/Triggers". Or is anyone already working on it?

gopidesupavan avatar May 30 '24 05:05 gopidesupavan

@Taragolis I am happy to take EMR, EC2 and S3 (operators) here

prakharcode avatar May 30 '24 23:05 prakharcode

@vincbeck @eladkal I'm working on "Use base aws classes in AWS Glue Data Catalog Sensors" and didn't find anything in the documentation or example code for it. I'm planning to add them here: https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/operators/glue.html. Is this fine?

gopidesupavan avatar Jun 28 '24 17:06 gopidesupavan

Please yes :)

vincbeck avatar Jun 28 '24 18:06 vincbeck

@gopidesupavan do you plan to address the other Glue services that were left?

eladkal avatar Aug 25 '24 08:08 eladkal

@eladkal Yes, I will update the other services in the coming week.

gopidesupavan avatar Aug 26 '24 08:08 gopidesupavan

I'll take a look at EC2.

ellisms avatar Feb 27 '25 16:02 ellisms

I'd love to see the RDS sensors extend from the AwsBaseSensor. We have a use-case where we need to support multiple regions, and requiring a new Connection for each region is painful.

hinchliff avatar Mar 18 '25 14:03 hinchliff

Glue Databrew was added in #41848

I'll have a look at the remaining Glue ones.

dominikhei avatar Jun 17 '25 06:06 dominikhei

On that note: the GlueJobCompleteTrigger currently does not come with a waiter in boto3. @ellisms, as you implemented the Operator, which pattern do you think is preferable? Is there something I am overlooking?

  1. Not letting the Trigger inherit from AwsBaseWaiterTrigger and just porting the Sensor. This is how it is done for the GlueCatalogPartitionSensor, which inherits from the base class whilst its trigger does not.

  2. Or create a new custom waiter and port the Trigger to the AwsBaseWaiterTrigger?


I have now opted for the second option. I will test all changes with DAGs and then open a draft PR.
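
For illustration, option 2 could look roughly like the sketch below. It assumes the custom waiter follows the botocore version 2 waiter layout and polls Glue's `GetJobRun`. The waiter name `job_complete`, the delay and attempt values, and the choice of failure states are assumptions for the example, not the configuration that was eventually written. The `lookup` and `classify` helpers are hypothetical and only handle plain dotted paths, whereas real waiters evaluate full JMESPath expressions.

```python
from __future__ import annotations

from typing import Any

# Waiter definition in the botocore "version 2" layout (values are illustrative).
GLUE_JOB_WAITER: dict[str, Any] = {
    "version": 2,
    "waiters": {
        "job_complete": {
            "operation": "GetJobRun",
            "delay": 60,
            "maxAttempts": 75,
            "acceptors": [
                {"matcher": "path", "argument": "JobRun.JobRunState",
                 "expected": "SUCCEEDED", "state": "success"},
                {"matcher": "path", "argument": "JobRun.JobRunState",
                 "expected": "FAILED", "state": "failure"},
                {"matcher": "path", "argument": "JobRun.JobRunState",
                 "expected": "STOPPED", "state": "failure"},
                {"matcher": "path", "argument": "JobRun.JobRunState",
                 "expected": "TIMEOUT", "state": "failure"},
            ],
        }
    },
}


def lookup(response: Any, dotted_path: str) -> Any:
    """Resolve a plain dotted path in a nested dict; returns None if a key is missing."""
    value = response
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def classify(response: dict[str, Any], waiter_name: str = "job_complete") -> str:
    """Return 'success', 'failure', or 'retry' for one polled response."""
    waiter = GLUE_JOB_WAITER["waiters"][waiter_name]
    for acceptor in waiter["acceptors"]:
        if (acceptor["matcher"] == "path"
                and lookup(response, acceptor["argument"]) == acceptor["expected"]):
            return acceptor["state"]
    # No acceptor matched, so the job is still in progress: poll again.
    return "retry"
```

Any state without a matching acceptor (for example `RUNNING`) falls through to a retry, which is the behaviour a trigger built on such a waiter would rely on.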

dominikhei avatar Jun 19 '25 11:06 dominikhei

@dominikhei I'd go with option 2. A quick scan of the current waiter code in the hook doesn't reveal anything special. A custom waiter would be able to handle it.

ellisms avatar Jun 23 '25 17:06 ellisms