Full Refresh arg in invocation_config does not work via DataformCreateWorkflowInvocationOperator
I'm trying to run Dataform in "Full Refresh" mode with the Airflow operator definition below, and it does not work: the repository only ever runs in non-full-refresh mode. I tried reading through the code but I can't see where the arg is getting dropped.
This means we effectively have no way to "Full Refresh" our Dataform tables from our production DAG.
create_workflow_invocation_full_refresh = DataformCreateWorkflowInvocationOperator(
    task_id="create_workflow_invocation_full_refresh",
    project_id=PROJECT_ID,
    region=DATAFORM_REGION,
    repository_id=REPOSITORY_ID,
    workflow_invocation={
        "compilation_result": "{{ task_instance.xcom_pull('create_compilation_result')['name'] }}",
        "invocation_config": {
            "fully_refresh_incremental_tables_enabled": True,
        }
    }
)
Class Definition: https://github.com/apache/airflow/blob/34ed71e52c1d8356194d34cb5018ff4032d66e2f/providers/google/src/airflow/providers/google/cloud/operators/dataform.py#L187
I also opened a sister issue in the Airflow repo, since I don't know who manages these operators: https://github.com/apache/airflow/issues/53843
This doesn't seem to be about our open-source framework, but about the Dataform API. Filing an issue in our Public Tracker seems more appropriate.
I'd recommend checking first:
- The version of the Dataform client used by your version of Airflow, to make sure that it has this argument.
- Validate that your incremental tables don't have protections against full refresh.
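For the first check, a minimal sketch along these lines can be run in the same Python environment as the Airflow workers. It assumes the provider imports the client from `google.cloud.dataform_v1beta1`, that this module exposes an `InvocationConfig` message, and that the distributions are named `apache-airflow-providers-google` and `google-cloud-dataform`; please verify these against your installation. The helper names are only illustrative.

```python
from importlib import import_module, metadata
from typing import Optional

FIELD = "fully_refresh_incremental_tables_enabled"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def invocation_config_has_field(
    module_name: str = "google.cloud.dataform_v1beta1",  # assumed client module
    field: str = FIELD,
) -> Optional[bool]:
    """Return True/False if the check could be made, None if it could not."""
    try:
        module = import_module(module_name)
    except ImportError:
        return None  # client library not importable in this environment
    config_cls = getattr(module, "InvocationConfig", None)
    if config_cls is None:
        return None  # message not exposed under this name (assumption failed)
    # proto-plus messages expose the underlying protobuf class via pb().
    return field in config_cls.pb().DESCRIPTOR.fields_by_name


if __name__ == "__main__":
    for pkg in ("apache-airflow-providers-google", "google-cloud-dataform"):
        print(pkg, installed_version(pkg))
    print(FIELD, "present:", invocation_config_has_field())
```

A result of `None` only means the check could not be performed in that environment, not that the argument is missing.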
If you validate this, we'll need details from you so that we can identify an example of your workflow invocation in Dataform: project number, location, repository id, workflow invocation id (can be sent privately).
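Most of these identifiers can be read off the workflow invocation's resource name. As a rough sketch, assuming the name follows the usual `projects/*/locations/*/repositories/*/workflowInvocations/*` pattern (the helper name is hypothetical, and the project segment may hold a project ID rather than a project number):

```python
import re
from typing import Dict

# Assumed resource name layout for a Dataform workflow invocation.
_NAME_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/repositories/(?P<repository_id>[^/]+)"
    r"/workflowInvocations/(?P<workflow_invocation_id>[^/]+)$"
)


def parse_workflow_invocation_name(name: str) -> Dict[str, str]:
    """Split a workflow invocation resource name into its identifiers."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected workflow invocation name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Placeholder values, not a real invocation.
    example = (
        "projects/123456/locations/us-central1"
        "/repositories/my-repo/workflowInvocations/abc-123"
    )
    print(parse_workflow_invocation_name(example))
```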