airflow-provider-great-expectations

Feature Request: pass parameters from Airflow to GE Checkpoint

Open kujaska opened this issue 1 year ago • 0 comments

We need to run a GE checkpoint from Airflow. The checkpoint is based on a SQL query, and the query must get its parameter values from Airflow. For example, a datamart should be checked for data quality (DQ) for a particular date and region after that date and region have been refreshed by another Airflow task.

Part of checkpoint.yml looks like:

validations:
  - batch_request:
      datasource_name: snowflake
      data_connector_name: default_runtime_data_connector_name
      data_asset_name: db1.table1
      runtime_parameters:
        query: "SELECT *
          FROM db1.table1
          WHERE fld1 > $DATE_PARAM_FROM_AIRFLOW AND fld2 = $REGION_PARAM_FROM_AIRFLOW"

How can this be done properly with GreatExpectationsOperator?

It looks like the operator cannot pass only the parameters, while using query_to_validate or checkpoint_config would break unit tests (you would need Airflow to test your checkpoint!).

Workaround: use environment variables.
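
A minimal sketch of the environment-variable workaround, using only the Python standard library. It assumes the upstream Airflow task exports the two values under the placeholder names from the checkpoint above, and it only mimics the `$NAME` substitution locally with `string.Template`; it is not GE's own implementation. The helper names `export_params` and `render_query` are hypothetical, and the sample values are made up for illustration.

```python
import os
from string import Template

# Query text as in the checkpoint.yml excerpt, with $NAME placeholders.
QUERY_TEMPLATE = (
    "SELECT * FROM db1.table1 "
    "WHERE fld1 > $DATE_PARAM_FROM_AIRFLOW AND fld2 = $REGION_PARAM_FROM_AIRFLOW"
)


def export_params(date_value: str, region: str) -> None:
    """Hypothetical helper: publish task parameters as environment variables.

    The values are wrapped in single quotes so they render as SQL string
    literals. They are assumed to come from a trusted source (no escaping).
    """
    os.environ["DATE_PARAM_FROM_AIRFLOW"] = f"'{date_value}'"
    os.environ["REGION_PARAM_FROM_AIRFLOW"] = f"'{region}'"


def render_query(template: str = QUERY_TEMPLATE) -> str:
    """Fill $NAME placeholders from the environment.

    Raises KeyError if a referenced variable is not set, which makes a
    missing parameter fail loudly instead of producing a broken query.
    """
    return Template(template).substitute(os.environ)


if __name__ == "__main__":
    export_params("2023-04-18", "EMEA")
    print(render_query())
```

Because the checkpoint file itself contains only placeholders, it can still be unit-tested without Airflow by setting the same two variables in the test environment. Note that the variables presumably have to be set in the same process (or a parent of it) that runs the checkpoint.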

Thanks!

kujaska, Apr 19 '23 07:04