airflow-operator
Local Executor: display worker logs in Stackdriver
When we run our workers using the Local executor, the worker logs are placed in a local folder (/usr/local/airflow/logs/). It would be nice if we could also see these logs in Stackdriver since they disappear when the scheduler pod is restarted.
@dimberman how do we redirect the logs to stdout instead of /usr/local/logs?
@barney-s How about side-car? Build a fluentd sidecar to monitor all files under /usr/local/logs/
Should we adjust the airflow logging to also log to stdout?
I'm not sure about this, and python logging configs combined with airflow configs is not the easiest duo.
But currently airflow.task has propagate=False in the default configs, and only links to a file handler.
If we change that to propagate=True, will task logs then hit stdout and then stackdriver?
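To make the propagate question concrete, here is a minimal sketch of the standard-library mechanics only. It does not import Airflow: the logger names demo and demo.task, the helper captured_by_parent, and the NullHandler standing in for the file handler are all assumptions made for illustration. Whether Airflow's real parent loggers have a console handler that would pick up the records is a separate question this sketch does not answer.

```python
import io
import logging


def captured_by_parent(propagate: bool) -> str:
    """Log one line on a child logger and return what the parent's handler saw."""
    stream = io.StringIO()  # stand-in for stdout
    parent = logging.getLogger("demo")       # hypothetical parent logger
    child = logging.getLogger("demo.task")   # hypothetical stand-in for airflow.task

    parent_handler = logging.StreamHandler(stream)
    child_handler = logging.NullHandler()    # stand-in for the file-only handler
    parent.addHandler(parent_handler)
    child.addHandler(child_handler)

    parent.setLevel(logging.INFO)
    parent.propagate = False  # keep the demo out of the real root logger
    child.setLevel(logging.INFO)
    child.propagate = propagate

    try:
        child.info("task log line")
    finally:
        # Remove the handlers so repeated calls do not stack them up.
        parent.removeHandler(parent_handler)
        child.removeHandler(child_handler)
    return stream.getvalue()
```

With propagate=False the parent's stream handler sees nothing; with propagate=True the record reaches it as well as the child's own handler. So the change would only help if some ancestor logger actually has a handler that writes to stdout.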
What does Cloud Composer do?
One puzzle was why logs go to stdout with airflow run or airflow test, but not when run by a worker. I think that's because of https://sourcegraph.com/github.com/apache/airflow@1c43cde/-/blob/airflow/bin/cli.py#L675 and https://sourcegraph.com/github.com/apache/airflow@1c43cde/-/blob/airflow/bin/cli.py#L520
(edited from previous)
This is my latest attempt: https://github.com/SixtyCapital/infrastructure/blob/ff3fb23b5a1b266cde51ccb1d7b9aef745118d44/docker/airflow/airflow_local_settings.py
Weirdly, when I set AIRFLOW__CORE__LOGGING_CONFIG_CLASS: "airflow_local_settings.K8S_LOGGING_CONFIG", I get pods using up so much memory that they're evicted en masse.
I wasn't sure what could be causing that and so moved on, but interesting if anyone gets it to work
@max-sixty any luck on this? Airflow 1.10.4 has enabled writing to STDOUT via the elasticsearch handler. Link to Merged PR
Unfortunately, it's all or nothing :( Reference to the Logging Module
The ideal state is for the code to emit to STDOUT and write to disk.
I haven't tried - interested if you can get it working though!
@max-sixty So I didn't really get the logging working with the provided configuration values. Ended up extending File Task Handler with stream handler, much like how the new Elasticsearch Implementation Handler works.
Had time to put this quick repo together as an example. https://github.com/kbbqiu/airflow-stdout-log-handler
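For readers who want the general shape of "file handler plus stream handler" without opening the repo, here is a rough sketch using only the standard library. It is not the code from the linked repository and it does not subclass Airflow's file task handler: the class name FileAndStreamHandler and its stream parameter are hypothetical, and logging.FileHandler is used as a stand-in base class.

```python
import logging
import sys


class FileAndStreamHandler(logging.FileHandler):
    """Hypothetical handler: writes each record to a file and to a stream."""

    def __init__(self, filename, stream=None):
        super().__init__(filename)
        # Second destination; defaults to stdout so a log collector can read it.
        target = stream if stream is not None else sys.stdout
        self._stream_handler = logging.StreamHandler(target)

    def setFormatter(self, fmt):
        # Keep both destinations formatted identically.
        super().setFormatter(fmt)
        self._stream_handler.setFormatter(fmt)

    def emit(self, record):
        # Write to the file first, then mirror the same record to the stream.
        super().emit(record)
        self._stream_handler.emit(record)
```

Attaching one instance to a logger with addHandler gives the "emit to STDOUT and write to disk" behaviour described earlier in the thread, at the cost of formatting each record twice.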
Great!
Yes same issue re your point https://github.com/kbbqiu/airflow-stdout-log-handler#issues-i-ran-into. Would be v interesting to know what's actually happening there. Your solution looks great though.