docker-airflow
docker-airflow copied to clipboard
Intermittent Airflow Log went missing
Expected behaviour Airflow Log will be in place unless someone delete it.
Actual behaviour Intermittent Airflow task log went missing.
Information
-
Version: 1.10.4
-
Executor: Celery Executor
-
Single EC2 Instance with Docker Webserver and Scheduler + Multiple Celery Worker hosted on different EC2 + Single EC2 for RabbitMQ + RDS
I have configured a few of my DAGs to retry multiple times, for this case is 4 times. As you can see from the pictures, 3 retries has failed and the 4th is successful. I suppose I can see all the logs for every retry in Airflow Web UI. But the log for second and third retries is gone and showing below error
Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url: http://xx.xx.xx.xx:xxxx/log/xxx/xxx/2020-06-03T08:00:00+00:00/3.log
I also noticed this is not the only DAG has this behaviour, other DAG also has the same pattern just that some shows the logs on seconds and fourth retries and the missing one are first and third retries.
Steps to reproduce the behavior Unable to reproduce since this is an intermittent issue.
Log exists

Log missing


I've been getting the same error and related ones. Sometimes logs just don't show up, or we'll get a "no hostname supplied" error. The cause seems to be a breakdown in communication between the workers, scheduler, and backend DB. But so far we haven't identified a solution for this.
Anyone else running into this or find a fix?