log_filename_template writes a different try_number than the one used when it reads the log

Open kuikeelc opened this issue 3 years ago • 7 comments

Apache Airflow version

2.3.3 (latest released)

What happened

When you use remote_logging = True and push the logs to GCS, Airflow writes the log exactly as defined in log_filename_template. However, when reading the log back, it adds 1 to the try_number and reports that it can't find the log file.

Our current setting is:

log_filename_template = dag_id={{ "{{" }} ti.dag_id {{ "}}" }}/run_id={{ "{{" }} ti.run_id {{ "}}" }}/task_id={{ "{{" }} ti.task_id {{ "}}" }}/{%% if ti.map_index >= 0 %%}map_index={{ "{{" }} ti.map_index {{ "}}" }}/{%% endif %%}attempt={{ "{{" }} ti.try_number {{ "}}" }}.log

Every task with try_number 1 is read back as attempt=2.log, which is not correct.
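To make the mismatch concrete, here is a minimal sketch of the path structure that the template above produces. `render_log_path` is a hypothetical helper written only for this illustration (it is not Airflow code), and the DAG, run, and task names are made up. It shows how a reader that asks for try_number + 1 ends up at a path that was never written.

```python
def render_log_path(dag_id: str, run_id: str, task_id: str,
                    try_number: int, map_index: int = -1) -> str:
    """Hypothetical helper mirroring the layout of the log_filename_template above."""
    parts = [f"dag_id={dag_id}", f"run_id={run_id}", f"task_id={task_id}"]
    if map_index >= 0:  # mapped tasks get an extra path segment
        parts.append(f"map_index={map_index}")
    parts.append(f"attempt={try_number}.log")
    return "/".join(parts)


# Illustrative values only.
written = render_log_path("example_dag", "scheduled__2022-01-01", "example_task", try_number=1)
# The symptom reported here: the read side asks for try_number + 1.
requested = render_log_path("example_dag", "scheduled__2022-01-01", "example_task", try_number=1 + 1)

print(written)    # path the task wrote to
print(requested)  # path the UI tries to read
```

The two paths differ only in the final `attempt=` segment, which matches the behaviour described in this report.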

What you think should happen instead

No response

How to reproduce

Use v2.3.3 in combination with GCS and remote_logging = True

Operating System

Linux

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

kuikeelc avatar Jul 11 '22 07:07 kuikeelc

Thanks for opening your first issue here! Be sure to follow the issue template!

boring-cyborg[bot] avatar Jul 11 '22 07:07 boring-cyborg[bot]

@uranusjr @dstandish @jedcunningham -> I think it should be related to recent fixes with log templates?

potiuk avatar Jul 11 '22 10:07 potiuk

It seems to be a GUI bug. The files are uploaded to Google Cloud Storage with the correct attempt numbers. In the GUI (Task Instance > Log > select an attempt, e.g. 1, 2, 3), however, it always seems to select max(attempt)+1, whichever attempt (1, 2, 3, 4, etc.) is chosen.

kuikeelc avatar Jul 12 '22 12:07 kuikeelc

@bbovenzi -> isn't that already on your radar ?

potiuk avatar Jul 12 '22 12:07 potiuk

@kuikeelc is this issue in both the grid view and the /log page?

bbovenzi avatar Aug 08 '22 18:08 bbovenzi

Hi,

yes. It is both in the Grid as well as the Graph GUI when I press "Log". This is understandable, since they both lead to the same page as far as I can see.

It seems that an incorrect attempt number is being sent along: it tries to retrieve the next attempt (= last attempt + 1) instead of the actual attempt number.

kuikeelc avatar Aug 09 '22 11:08 kuikeelc

If it helps, this is the error message I get when requesting the log file:

*** Unable to read remote log from gs://[HIDDEN LOCATION]/dag_id=osb__unit_test/run_id=scheduled__2022-08-07T22:15:00+00:00/task_id=from_aws_s3_multiple_csv_no_dedup_test/attempt=2.log

As you can see, it tries to request attempt=2.log, but that file does not exist. The task ran once successfully, so the maximum attempt is 1. It would work if the actual attempt number were used when requesting the log from Google Cloud Storage.
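As a rough way to quantify the offset, the attempt number can be parsed out of the error line and compared with the number of times the task actually ran. This is only a sketch: `requested_attempt` is a hypothetical helper, and the regular expression assumes the `attempt=<n>.log` suffix from the template in this report.

```python
import re

# Error line copied from the message above.
ERROR_LINE = (
    "*** Unable to read remote log from gs://[HIDDEN LOCATION]/dag_id=osb__unit_test/"
    "run_id=scheduled__2022-08-07T22:15:00+00:00/"
    "task_id=from_aws_s3_multiple_csv_no_dedup_test/attempt=2.log"
)


def requested_attempt(message: str) -> int:
    """Extract <n> from a trailing 'attempt=<n>.log' path segment (hypothetical helper)."""
    match = re.search(r"attempt=(\d+)\.log", message)
    if match is None:
        raise ValueError("no attempt=<n>.log segment found")
    return int(match.group(1))


actual_attempts = 1  # the task ran once, per the report
offset = requested_attempt(ERROR_LINE) - actual_attempts
print(offset)
```

A non-zero offset means the read path does not correspond to any log the task wrote.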

kuikeelc avatar Aug 09 '22 14:08 kuikeelc

@bbovenzi Did you get a chance to have a look at this?

kuikeelc avatar Oct 06 '22 17:10 kuikeelc

This issue has been automatically marked as stale because it has been open for 365 days without any activity. There has been several Airflow releases since last activity on this issue. Kindly asking to recheck the report against latest Airflow version and let us know if the issue is reproducible. The issue will be closed in next 30 days if no further activity occurs from the issue author.

github-actions[bot] avatar Oct 07 '23 07:10 github-actions[bot]

This issue has been closed because it has not received response from the issue author.

github-actions[bot] avatar Nov 07 '23 07:11 github-actions[bot]