Enable configuring ambassador STDOUT logs as JSON format
Please describe your use case / problem.
After recently setting the envoy_log_type: json option as part of enabling Datadog APM tracing, most access logs are written as JSON. It appears, however, that many log messages are still written to STDOUT as plain text.
This was unexpected and makes it significantly more complicated to define a log pipeline configuration which can parse all the different log message formats.
Describe the solution you'd like
A single option for configuring the log format of these ambassador pods, with the ability to configure all pod logs to be JSON, would be preferred.
Describe alternatives you've considered
As a workaround I will experiment with additional logging configuration to support the multiple patterns coming from a single pod.
Additional context
Here are some examples of log lines which I was surprised came through as text and not JSON after configuring envoy_log_type: json. They were intermingled with the JSON access log messages and didn't parse.
2020-10-09 14:37:24 diagd 1.5.1 [P383TThreadPoolExecutor-0_8] INFO: 9B8F4961-FB07-4012-A5E9-484B8A520C3D: <ipaddress> "GET /metrics" 124ms 200 success
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/wire_format_lite.cc:578] String field 'google.protobuf.Value.string_value' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
E1008 21:09:38.924448 390 reflector.go:123] k8s.io/[email protected]/tools/cache/reflector.go:96: Failed to list <nil>: Get https://<ipaddress>:443/apis/getambassador.io/v2/namespaces/<namespace>/kubernetesendpointresolvers?limit=500&resourceVersion=0: dial tcp <ipaddress>:443: connect: no route to host
Trace[934878924]: [30.000552118s] [30.000552118s] END
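For reference, here is roughly the Module change that produced the mix above; the namespace and metadata are illustrative rather than our exact manifest:

```yaml
# Ambassador Module (getambassador.io/v2, as used by Ambassador 1.x) with JSON access logs.
apiVersion: getambassador.io/v2
kind: Module
metadata:
  name: ambassador
  namespace: ambassador   # adjust to the namespace of your install
spec:
  config:
    envoy_log_type: json
```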
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
afaik this is still an issue
looks like error logs and many other logs still use text format
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
if this issue has not yet been fixed, then it is not stale; it is active.
FWIW, we send all our logs to Datadog. If the logs are NOT JSON formatted, then Datadog treats them as level=ERROR, and it is not ideal to see thousands of error logs from Ambassador every hour.
We have done the following to help mitigate a lot of the noisy "error" logs from Ambassador.
- Use envoy_log_type: json in our ambassador Module configuration.
- Create a custom Datadog log pipeline for the diagd log messages. Using a grok parser, the one below did the trick for us.
aes.diagd %{date("yyyy-MM-dd HH:mm:ss"):date} diagd %{regex("[0-9.]*"):version} \[%{data:process_thread}\] %{word:level}: %{data:message}
We still have quite a few "snapshot" activity logs that we would also like to format as JSON. Creating a custom log pipeline in Datadog would not be ideal, because there is not much of a pattern for these messages and there is no log level in them. (And we'd rather treat the disease, not the symptom).
Examples:
PID 160, 0.06Gi (exited): envoy --config-path /ambassador/snapshots/econf-tmp.json --mode validate
PID 37, 0.07Gi: /usr/bin/python /usr/bin/diagd /ambassador/snapshots /ambassador/bootstrap-ads.json /ambassador/envoy/envoy.json --notices /ambassador/notices.json --port 8004 --kick kill -HUP 1
PID 19, 0.14Gi: /ambassador/sidecars/amb-sidecar
2021/04/19 18:40:13 Memory Usage 1.63Gi (41%)
2021/04/19 18:40:01 Taking K8sSecret snapshot.SecretRef{Namespace:"ambassador", Name:"fallback-self-signed-cert"}
We ran into this exact situation; it makes our logs look terrible and can trigger alerts. Is there any movement on fixing this? I feel like this should be a critical bug.
I think there are 2 bandaid solutions that might be relevant here:
- Set AES_LOG_LEVEL: error in the container's env variables (https://www.getambassador.io/docs/edge-stack/latest/topics/running/environment/). Assuming you're running later than 1.5.4, this will minimize the vast majority of logs produced by the Ambassador control plane.
- Set a separate Envoy log path from stdout using the Module (https://www.getambassador.io/docs/edge-stack/latest/topics/running/ambassador/#envoy). Then you can point Datadog at your Envoy log path for access logs and do something completely separate for stdout, which will now only contain control plane logs. A rough sketch of both options is below.
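To be concrete, here's a sketch of what I mean, assuming an Ambassador 1.x install; names, namespace, image tag, and paths are illustrative:

```yaml
# 1) Quiet the control plane: set AES_LOG_LEVEL on the Ambassador container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ambassador
  namespace: ambassador
spec:
  replicas: 1
  selector:
    matchLabels:
      service: ambassador
  template:
    metadata:
      labels:
        service: ambassador
    spec:
      containers:
        - name: ambassador
          image: docker.io/datawire/aes:1.13.0   # whatever version you actually run
          env:
            - name: AES_LOG_LEVEL
              value: "error"
---
# 2) Send Envoy access logs to a file instead of stdout via the Module.
apiVersion: getambassador.io/v2
kind: Module
metadata:
  name: ambassador
  namespace: ambassador
spec:
  config:
    envoy_log_type: json
    envoy_log_path: /var/log/ambassador/envoy-access.log   # any path writable by the container
```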
I'm admittedly pretty ignorant with respect to how Datadog handles log gathering, so I'm not sure if #2 will actually work the way I envision; maybe someone with more experience can expand on that idea.
I think the 2nd option can work, but the file has to be mounted into the agent container. I am going to try option 1. All we really need is error logs and Envoy traffic that we can link with traces.
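If anyone does go the 2nd route, the Datadog side would look roughly like this; the path, source, and service names are illustrative, and the access-log file has to be reachable from the Agent container (e.g. via a shared volume mount):

```yaml
# Custom log collection config for the Datadog Agent,
# e.g. conf.d/envoy.d/conf.yaml inside the Agent container.
# Assumes logs_enabled: true and that the path below matches whatever you set
# as envoy_log_path and is on a volume shared with the Agent.
logs:
  - type: file
    path: /var/log/ambassador/envoy-access.log
    source: envoy
    service: ambassador
```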
For anyone else: it seems that with AES_LOG_LEVEL: error you still get memory usage logs.
Thanks for everyone's help. And hope this helps the OP too.
I worked on https://github.com/datawire/ambassador/pull/3437 recently and was able to address JSON logging for most of the Ambassador control plane code. Unfortunately, it still doesn't cover the logging emitted by our Kubernetes client-go dependencies and it probably won't cover the protobuf CPP warning/error messages. I think those might be coming from within Envoy... I'd need to dig deeper.
This change should however improve life significantly for those that want "almost all JSON logs" and should reduce the noise to a manageable minimum until we can track down and address the remaining rogue logs.