apache-airflow-providers-amazon/8.7.1 uses wrong log group for fargate container

Open 5e9148d9 opened this issue 1 year ago • 3 comments

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

8.7.1

Apache Airflow version

2.7.2

Operating System

AWS service

Deployment

Amazon (AWS) MWAA

Deployment details

No response

What happened

The Python DAG file sets EcsRunTaskOperator.awslogs_group="my-loggroup", and the task fails with the following error in the logs:

[2024-02-08 18:56:36,345] Cannot find log stream yet, it can take a couple of seconds to show up. If this error persists, check that the log group and stream are correct: group: my-loggroup	stream: ecs/my-container/b478044071514a32b3bf8c7dee5ebea4
[2024-02-08 18:56:36,420] ECS Task stopped, check status: {removed}
[2024-02-08 18:56:36,485] Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/operators/ecs.py", line 585, in execute
    self._after_execution()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/operators/ecs.py", line 610, in _after_execution
    self._check_success_task()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 746, in decorator_f
    return fun(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/operators/ecs.py", line 731, in _check_success_task
    self.task_log_fetcher.get_last_log_messages(self.number_logs_exception)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/providers/amazon/aws/utils/task_log_fetcher.py", line 103, in get_last_log_messages
    response = self.hook.conn.get_log_events(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 980, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceNotFoundException: An error occurred (ResourceNotFoundException) when calling the GetLogEvents operation: The specified log stream does not exist.
[2024-02-08 18:56:36,515] Marking task as UP_FOR_RETRY. dag_id=my_dag_id_name, task_id=my_step1_name, execution_date=20240208T185300, start_date=20240208T185304, end_date=20240208T185636
[2024-02-08 18:56:36,540] Failed to execute job 111 for task data_update (An error occurred (ResourceNotFoundException) when calling the GetLogEvents operation: The specified log stream does not exist.; 172)
[2024-02-08 18:56:36,649] Task exited with return code 1
[2024-02-08 18:56:36,689] 0 downstream tasks scheduled from follow-on schedule check

The issue is that the Fargate task is created with a different log group (the one from its task definition), and the value of EcsRunTaskOperator.awslogs_group does not override it.
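
To illustrate why the lookup fails: with the awslogs log driver, ECS names log streams as prefix/container-name/task-id, which matches the stream shown in the log above. The stream name is the same whichever group is queried, but the stream only exists in the group the task definition writes to. The following is a minimal sketch, not the provider's actual code; the helper name, region, account ID, and cluster name are made up for illustration.

```python
def expected_log_stream(stream_prefix: str, container_name: str, task_arn: str) -> str:
    """Build the awslogs stream name: <prefix>/<container-name>/<task-id>."""
    # The task ID is the last path segment of the task ARN.
    task_id = task_arn.rsplit("/", 1)[-1]
    return f"{stream_prefix}/{container_name}/{task_id}"


# Hypothetical ARN; only the trailing task ID is taken from the log above.
task_arn = (
    "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/"
    "b478044071514a32b3bf8c7dee5ebea4"
)
stream = expected_log_stream("ecs", "my-container", task_arn)
```

Querying this stream in "my-loggroup" raises ResourceNotFoundException because the container wrote it to the task definition's log group instead.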

What you think should happen instead

The value of EcsRunTaskOperator.awslogs_group should override the log group name set in the Fargate task definition.

How to reproduce

  1. Create a Python DAG that runs an ECS Fargate task.
  2. Set a custom log group name in EcsRunTaskOperator.awslogs_group (it should differ from the log group name defined in the Fargate task definition).
  3. Run the pipeline and observe the traceback.
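
The mismatch in step 2 can be detected before the DAG runs by comparing the operator's awslogs_group against the task definition. The sketch below assumes a dict shaped like the "taskDefinition" part of a boto3 describe_task_definition response; the function name and the "fargate-default-loggroup" value are hypothetical.

```python
def awslogs_groups(task_definition: dict) -> dict:
    """Map container name -> log group for containers using the awslogs driver."""
    groups = {}
    for container in task_definition.get("containerDefinitions", []):
        log_config = container.get("logConfiguration") or {}
        if log_config.get("logDriver") == "awslogs":
            groups[container["name"]] = log_config.get("options", {}).get("awslogs-group")
    return groups


# Hypothetical task definition; only the fields used here are included.
task_definition = {
    "containerDefinitions": [
        {
            "name": "my-container",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "fargate-default-loggroup",
                    "awslogs-stream-prefix": "ecs",
                },
            },
        }
    ]
}

operator_awslogs_group = "my-loggroup"  # value passed to EcsRunTaskOperator

# Containers whose actual log group differs from what the operator will query.
mismatched = {
    name: group
    for name, group in awslogs_groups(task_definition).items()
    if group != operator_awslogs_group
}
```

A non-empty mismatched dict means the operator will look for the stream in a group the container never writes to.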

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

5e9148d9 avatar Feb 09 '24 13:02 5e9148d9

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

boring-cyborg[bot] avatar Feb 09 '24 13:02 boring-cyborg[bot]

Python DAG file defines EcsRunTaskOperator.awslogs_group="my-loggroup". And we got error in logs

Issue that fargate task is created with another log group and value from EcsRunTaskOperator.awslogs_group does not override it

This doesn't seem like a bug to me; logs are only collected from the valid log groups provided by the user.

Taragolis avatar Feb 16 '24 11:02 Taragolis

Python DAG file defines EcsRunTaskOperator.awslogs_group="my-loggroup". And we got error in logs Issue that fargate task is created with another log group and value from EcsRunTaskOperator.awslogs_group does not override it

Doesn't seems a bug to me, logging only collected from the valid log groups provided by the users.

Well, I do agree with you to some extent, but there are two points that are quite misleading.

  1. EcsRunTaskOperator.awslogs_group does not override the task definition value, while EcsRunTaskOperator.awslogs-stream-prefix does override the task definition value.

    To me, this is pretty strange behaviour.

  2. From documentation:

awslogs_group (str | None) – the CloudWatch group where your ECS container logs are stored. Only required if you want logs to be shown in the Airflow UI after your job has finished.

I don't see anything here like "Use the log group name from the task definition. All other names will fail."

5e9148d9 avatar Feb 21 '24 09:02 5e9148d9

This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

github-actions[bot] avatar Mar 07 '24 00:03 github-actions[bot]

This issue has been closed because it has not received response from the issue author.

github-actions[bot] avatar Mar 14 '24 00:03 github-actions[bot]