
Frequent timeouts using FivetranSensor

Open jtalmi opened this issue 2 years ago • 10 comments

Hi all, we have been experiencing frequent timeouts with FivetranSensor tasks on our Airflow instances. We are running Airflow 2.2.4.

Fivetran support instructed us to open an issue here since it appears to be an airflow issue rather than a Fivetran issue.

Please let me know any additional information you require :)

jtalmi avatar May 19 '22 02:05 jtalmi

Thanks for reaching out, Jonathan! Do you know how many DAGs you are running at the same time and how many workers you are using when you are seeing these sensor issues?

PubChimps avatar May 26 '22 21:05 PubChimps

Hey Nick. We are running around 3-5 DAGs at the same time, and they run on different pods. We use the K8s executor.

Best, Jonathan

jtalmi avatar May 26 '22 22:05 jtalmi

Thanks for letting me know. I know this is a relative question, but would you say the connectors you are experiencing issues with are "long-running connectors"? Do you see this behavior with all connectors, even ones that take a couple of seconds or minutes, or only with connectors that take a couple of hours?

PubChimps avatar May 31 '22 21:05 PubChimps

These connectors usually take a few minutes max.

Best, Jonathan

jtalmi avatar May 31 '22 21:05 jtalmi

hey @jtalmi - jumping in from Astronomer to help out. What are the error messages you're getting?

virajmparekh avatar Jun 02 '22 03:06 virajmparekh

FYI, the tasks are failing on timeout.

We have a task timeout set at 2 hours. The task typically takes about 1 minute, but every few hours (the DAG runs hourly) the sensor runs for the full two hours and times out.

[2022-06-03, 13:03:47 UTC] {taskinstance.py:1700} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1511, in _execute_task
    result = execute_callable(context=context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/sensors/base.py", line 240, in execute
    raise AirflowSensorTimeout(f"Snap. Time is OUT. DAG id: {log_dag_id}")
airflow.exceptions.AirflowSensorTimeout: Snap. Time is OUT. DAG id: duoplane_product_report_dag_v1
[2022-06-03, 13:03:47 UTC] {taskinstance.py:1267} INFO - Immediate failure requested. Marking task as FAILED. dag_id=duoplane_product_report_dag_v1, task_id=duoplane_products_s3_to_snowflake_fivetran_sensor, execution_date=20220603T100000, start_date=20220603T110246, end_date=20220603T130347
[2022-06-03, 13:03:47 UTC] {standard_task_runner.py:89} ERROR - Failed to execute job 39399 for task duoplane_products_s3_to_snowflake_fivetran_sensor
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/task/task_runner/standard_task_runner.py", line 85, in _start_by_fork
    args.func(args, dag=self.dag)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/cli.py", line 92, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/task_command.py", line 298, in task_run
    _run_task_by_selected_method(args, dag, ti)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/task_command.py", line 107, in _run_task_by_selected_method
    _run_raw_task(args, ti)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/task_command.py", line 180, in _run_raw_task
    ti._run_raw_task(
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 70, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1511, in _execute_task
    result = execute_callable(context=context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/sensors/base.py", line 240, in execute
    raise AirflowSensorTimeout(f"Snap. Time is OUT. DAG id: {log_dag_id}")
airflow.exceptions.AirflowSensorTimeout: Snap. Time is OUT. DAG id: duoplane_product_report_dag_v1
[2022-06-03, 13:03:47 UTC] {local_task_job.py:154} INFO - Task exited with return code 1

Best, Jonathan

jtalmi avatar Jun 03 '22 15:06 jtalmi
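For context, the failing setup described above corresponds roughly to the sketch below. This is a reconstruction, not the reporter's actual DAG: the connection and connector IDs are placeholders, and the import path assumes the provider's fivetran_provider package layout. The timeout argument is Airflow's standard BaseSensorOperator timeout, and it is what raises the AirflowSensorTimeout seen in the traceback when the sensor never detects a completed sync within 2 hours.

from datetime import datetime

from airflow import DAG
from fivetran_provider.sensors.fivetran import FivetranSensor

with DAG(
    dag_id="duoplane_product_report_dag_v1",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
):
    # The sync normally finishes in about a minute; timeout=7200 (2 hours) is the
    # standard sensor timeout that raises the AirflowSensorTimeout in the traceback
    # above when no completed sync is detected in time.
    FivetranSensor(
        task_id="duoplane_products_s3_to_snowflake_fivetran_sensor",
        fivetran_conn_id="fivetran_default",  # placeholder connection id
        connector_id="my_connector_id",       # placeholder connector id
        poke_interval=60,                     # check the Fivetran API every 60 seconds
        timeout=7200,                         # give up after 2 hours
    )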

Hey @PubChimps @virajmparekh, resurfacing this as we're still experiencing a lot of failures.

jtalmi avatar Jun 13 '22 02:06 jtalmi

I believe this may be because your sensor is not correctly monitoring the sync triggered by the FivetranOperator. If so, I have a fix for this using XComs; I just need to do a bit more testing, and I hope to have it packaged in a new release by the end of the week.

PubChimps avatar Jun 13 '22 23:06 PubChimps

Thanks Nick!

jtalmi avatar Jun 14 '22 01:06 jtalmi

Ok Jonathan, could you update to version 1.1.0 of the provider and try it out using this example?

PubChimps avatar Jun 15 '22 23:06 PubChimps
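For anyone following along later, the 1.1.0 approach referenced above pairs the operator and sensor through XCom so that the sensor waits on the specific sync the operator started, rather than matching any sync on the connector. Below is a minimal sketch based on my reading of the provider's example DAG; the xcom argument on FivetranSensor, the import paths, and the IDs are assumptions to verify against the provider's documentation.

from datetime import datetime

from airflow import DAG
from fivetran_provider.operators.fivetran import FivetranOperator
from fivetran_provider.sensors.fivetran import FivetranSensor

with DAG(
    dag_id="fivetran_xcom_example",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
):
    # Trigger the sync; per the discussion above, the operator makes the sync's
    # start timestamp available via XCom in 1.1.0+.
    fivetran_sync_start = FivetranOperator(
        task_id="fivetran_sync_start",
        fivetran_conn_id="fivetran_default",  # placeholder connection id
        connector_id="my_connector_id",       # placeholder connector id
    )

    # Pass that timestamp to the sensor so it waits for the sync started above,
    # not an unrelated sync of the same connector.
    fivetran_sync_wait = FivetranSensor(
        task_id="fivetran_sync_wait",
        fivetran_conn_id="fivetran_default",  # placeholder connection id
        connector_id="my_connector_id",       # placeholder connector id
        poke_interval=60,
        xcom="{{ task_instance.xcom_pull('fivetran_sync_start') }}",
    )

    fivetran_sync_start >> fivetran_sync_wait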

Thanks again, @jtalmi. Closing this issue; it should be resolved in version 1.1.0 and later.

PubChimps avatar Sep 13 '22 16:09 PubChimps

[screenshot of the exception]

@PubChimps I am using xcom for short-term connectors and getting the exception shown above.