
Local Executor: display worker logs in Stackdriver

Open jcunhasilva opened this issue 6 years ago • 9 comments

When we run our workers using the Local executor, the worker logs are placed in a local folder (/usr/local/airflow/logs/). It would be nice if we could also see these logs in Stackdriver since they disappear when the scheduler pod is restarted.

jcunhasilva avatar Feb 11 '19 10:02 jcunhasilva

@dimberman how do we redirect the logs to stdout instead of /usr/local/airflow/logs/?

barney-s avatar Feb 11 '19 20:02 barney-s

@barney-s How about a sidecar? Build a fluentd sidecar to monitor all files under /usr/local/airflow/logs/.

jie8357IOII avatar Mar 17 '19 02:03 jie8357IOII

Should we adjust the Airflow logging to also log to stdout?

I'm not sure about this, and Python logging configs combined with Airflow configs are not the easiest duo. But currently the airflow.task logger has propagate=False in the default config, and is attached only to a file handler.

If we change that to propagate=True, will task logs then hit stdout and then Stackdriver?
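To illustrate the propagate question in isolation, here is a minimal sketch using only the Python standard library. It does not touch Airflow: the logger names "demo" and "demo.task" are hypothetical stand-ins for a parent logger with a stream handler and for airflow.task, and a NullHandler stands in for the file handler. Whether Airflow's own default config behaves the same way once propagate is flipped is an assumption this sketch does not verify.

```python
import io
import logging


def emit_from_child(propagate):
    """Log one record on a child logger and return what the parent's stream handler saw."""
    captured = io.StringIO()  # stands in for stdout so the result can be inspected

    # Hypothetical parent logger that owns the "stdout" handler.
    parent = logging.getLogger("demo")
    parent.handlers.clear()
    parent.setLevel(logging.INFO)
    parent.propagate = False  # keep the demo isolated from the real root logger
    parent.addHandler(logging.StreamHandler(captured))

    # Hypothetical child logger, playing the role of the task logger.
    child = logging.getLogger("demo.task")
    child.handlers.clear()
    child.addHandler(logging.NullHandler())  # stand-in for the file-only handler
    child.propagate = propagate

    child.info("task started")
    return captured.getvalue()


if __name__ == "__main__":
    print(repr(emit_from_child(False)))
    print(repr(emit_from_child(True)))
```

With propagate=False the record stops at the child's own handlers; with propagate=True it also reaches the handlers of the ancestor loggers, which is the mechanism the question above relies on.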

What does Cloud Composer do?

max-sixty avatar Apr 25 '19 03:04 max-sixty

One puzzle was why logs go to stdout with airflow run or airflow test, but not when run by a worker. I think that's because of https://sourcegraph.com/github.com/apache/airflow@1c43cde/-/blob/airflow/bin/cli.py#L675 and https://sourcegraph.com/github.com/apache/airflow@1c43cde/-/blob/airflow/bin/cli.py#L520

max-sixty avatar Apr 25 '19 03:04 max-sixty

(edited from previous)

This is my latest attempt: https://github.com/SixtyCapital/infrastructure/blob/ff3fb23b5a1b266cde51ccb1d7b9aef745118d44/docker/airflow/airflow_local_settings.py

Weirdly, when I set AIRFLOW__CORE__LOGGING_CONFIG_CLASS: "airflow_local_settings.K8S_LOGGING_CONFIG", the pods use so much memory that they're evicted en masse.

I wasn't sure what could be causing that, so I moved on, but I'd be interested to hear if anyone gets it to work.

max-sixty avatar Apr 25 '19 15:04 max-sixty

@max-sixty any luck on this? Airflow 1.10.4 has enabled writing to STDOUT via the Elasticsearch handler (link to merged PR).

Unfortunately, it's all or nothing :( (reference to the logging module).

The ideal state is for the code to emit to STDOUT and write to disk.

kbbqiu avatar Aug 24 '19 06:08 kbbqiu

I haven't tried - interested if you can get it working though!

max-sixty avatar Aug 24 '19 07:08 max-sixty

@max-sixty So I didn't really get the logging working with the provided configuration values. I ended up extending the file task handler with a stream handler, much like how the new Elasticsearch handler implementation works.
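The general shape of that approach (a file handler that also mirrors each record to a stream) can be sketched with the standard library alone. This is not the code from the linked repo and does not subclass Airflow's handler: FileAndStreamHandler, demo, and the "demo.tee" logger name are hypothetical, and logging.FileHandler is used as a stand-in for the file task handler.

```python
import io
import logging
import os
import sys
import tempfile


class FileAndStreamHandler(logging.FileHandler):
    """Hypothetical handler: writes each record to a file and mirrors it to a stream."""

    def __init__(self, filename, stream=None):
        super().__init__(filename)
        # Default to stdout so a log collector reading container output sees the record.
        self._mirror = logging.StreamHandler(stream if stream is not None else sys.stdout)

    def setFormatter(self, fmt):
        # Keep the file copy and the mirrored copy formatted identically.
        super().setFormatter(fmt)
        self._mirror.setFormatter(fmt)

    def emit(self, record):
        super().emit(record)        # write to disk
        self._mirror.emit(record)   # and to the stream

    def close(self):
        self._mirror.flush()
        super().close()


def demo():
    """Log one record and return (file contents, stream contents)."""
    stream = io.StringIO()  # stands in for stdout so the result can be inspected
    log_path = os.path.join(tempfile.mkdtemp(), "task.log")

    handler = FileAndStreamHandler(log_path, stream=stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    logger = logging.getLogger("demo.tee")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through ancestor loggers
    logger.addHandler(handler)

    logger.info("hello")
    handler.close()

    with open(log_path) as f:
        return f.read(), stream.getvalue()


if __name__ == "__main__":
    print(demo())
```

Subclassing the file handler, rather than attaching a second handler in the logging config, keeps the change in one place and leaves the on-disk logs intact, which matches the stated goal of emitting to STDOUT and writing to disk.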

Had time to put this quick repo together as an example. https://github.com/kbbqiu/airflow-stdout-log-handler

kbbqiu avatar Aug 27 '19 17:08 kbbqiu

Great!

Yes, I hit the same issue as in your point at https://github.com/kbbqiu/airflow-stdout-log-handler#issues-i-ran-into. It would be very interesting to know what's actually happening there. Your solution looks great though.

max-sixty avatar Aug 27 '19 17:08 max-sixty