docker-airflow

Intermittent Airflow Log went missing

Open yyseik opened this issue 4 years ago • 1 comments

Expected behaviour: Airflow task logs stay in place unless someone deletes them.

Actual behaviour: Airflow task logs intermittently go missing.

Information

  • Version: 1.10.4

  • Executor: Celery Executor

  • Setup: a single EC2 instance running the webserver and scheduler in Docker + multiple Celery workers hosted on separate EC2 instances + a single EC2 instance for RabbitMQ + RDS

I have configured a few of my DAGs to retry multiple times; in this case, 4 times. As you can see from the pictures, 3 retries failed and the 4th succeeded. I expected to see the logs for every retry in the Airflow web UI, but the logs for the second and third retries are gone and the UI shows the error below:

Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url: http://xx.xx.xx.xx:xxxx/log/xxx/xxx/2020-06-03T08:00:00+00:00/3.log

I also noticed that this is not the only DAG with this behaviour. Other DAGs show the same pattern, except that for some of them the logs for the second and fourth retries are shown while those for the first and third are missing.
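To narrow this down, it may help to request each attempt's log directly from every worker and see which host actually holds the file. The sketch below is only an illustration: it rebuilds a URL in the same shape as the one in the 404 above and reports the HTTP status per attempt. The host names, port, DAG id and task id are placeholders (the real values are masked in the error message), and the assumption that logs are kept only on each worker's local disk, so that an attempt's file exists only on the worker that ran it, has not been confirmed for this setup.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def build_log_url(host, port, dag_id, task_id, execution_date, try_number):
    """Build a worker log URL in the same shape as the one in the 404 error."""
    return "http://{}:{}/log/{}/{}/{}/{}.log".format(
        host, port, dag_id, task_id, execution_date, try_number
    )


def probe(url, timeout=5):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return None


if __name__ == "__main__":
    # Placeholder values: replace with your own workers, port, DAG and task.
    workers = ["worker-1", "worker-2"]
    for host in workers:
        for attempt in range(1, 5):
            url = build_log_url(
                host, 8793, "my_dag", "my_task",
                "2020-06-03T08:00:00+00:00", attempt,
            )
            print(host, attempt, probe(url))
```

If a missing attempt returns 200 on one worker and 404 on the others, the file still exists and the web UI is simply asking the wrong host; if it returns 404 everywhere, the file was never written or has been removed.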

Steps to reproduce the behaviour: Unable to reproduce, since this is an intermittent issue.

Log exists: 1-ok, 4-ok

Log missing: 2-missing, 3-missing

yyseik, Jun 05 '20 11:06

I've been getting the same error and related ones. Sometimes logs just don't show up, or we'll get a "no hostname supplied" error. The cause seems to be a breakdown in communication between the workers, scheduler, and backend DB. But so far we haven't identified a solution for this.

Anyone else running into this or find a fix?

pnewell88, Aug 18 '20 16:08