SemanticLogger integration not correlating logs with server/environment
Current behaviour SemanticLogger integration is not correlating logs with server/environment (configured via DD_SERVICE/DD_ENV)
Expected behaviour
Given the quote below, I'd expect rails_semantic_logger to just work with Datadog and correlate the logs with their traces and correct env/service (i.e. the ones supplied by the DD_ENV and DD_SERVICE environment variables)
For Rails applications using the default logger (ActiveSupport::TaggedLogging), lograge, or semantic_logger, trace correlation injection is automatically configured. from: https://docs.datadoghq.com/tracing/other_telemetry/connect_logs_and_traces/ruby/
Steps to reproduce
- Started a Datadog docker agent using docker-compose:
dd-agent: image: gcr.io/datadoghq/agent:7 pid: host environment: - DD_ENV=rafael-testing - DD_API_KEY=${DD_API_KEY} - DD_LOGS_ENABLED=true ports: - 8126:8126 - 10518:10518 volumes: - /var/run/docker.sock:/var/run/docker.sock:ro - /var/lib/docker/containers:/var/lib/docker/containers:ro - ${PWD}/dd-agent/logs.yaml:/conf.d/logs.yaml - My mounted logs configuration:
logs: - type: tcp port: 10518 service: log-echo source: ruby - On a Rails server (on the host machine, not a container) I added the gems
rails_semantic_loggerandnet_tcp_client - On datadog configuration I added:
c.tracing.instrument :rails c.tracing.log_injection = true - On
config/development.rbI got:config.log_level = ENV.fetch("LOG_LEVEL", "info").downcase.strip.to_sym config.rails_semantic_logger.add_file_appender = false config.semantic_logger.add_appender(appender: :tcp, server: "localhost:10518") - On a Rails route, I just added a simple
Rails.logger.info("Hello world") - Finally, I started the Rails server with:
DD_SERVICE=auth-user DD_VERSION=after-sl DD_ENV=rafael-testing2 bundle exec rails server - Result:

Environment
- ddtrace version: 1.5.1
- Ruby version:
- Operating system: 3.1.2
- Relevant library versions: net_tcp_client (2.2.0), rails_semantic_logger (4.10.0), semantic_logger (4.12.0)
Some further testing, this time NOT using any Ruby process. With only a datadog agent with the following logs config:
logs:
- type: tcp
port: 10518
service: log-echo
source: ruby
-
A JSON log with
ddon the root object.echo '{ "dd": { "trace_id": "754516775358317257", "env": "rafael-testing", "service": "auth-user", "version":42 }, "ddsource": "ruby-overwrite", "message": "HI DAWG" }' | jq -c | nc localhost 10518trace reported ✅ environment reported ✅ service reported (overwritten) ✅ version reported ✅ source: ignored (not overwritten) :x:
-
A JSON log with
ddinside anamed_tagsobject (this is how SemanticLogger does it by default)echo '{ "named_tags": { "dd": { "trace_id": "754516775358317257", "env": "rafael-testing", "service": "auth-user", "version":42 }, "ddsource": "ruby-overwrite" }, "message": "HI DAWG" }' | jq -c | nc localhost 10518trace reported ✅ environment ignored :x: service ignored(not overwritten) :x: version ignored :x: source ignored (not overwritten) :x:
-
A JSON log with
ddinside an object with a name different thannamed_tagsecho '{ "weirdly_named": { "dd": { "trace_id": "754516775358317257", "env": "rafael-testing", "service": "auth-user", "version":42 }, "ddsource": "ruby-overwrite" }, "message": "HI DAWG" }' | jq -c | nc localhost 10518trace ignored :x: environment ignored :x: service ignored(not overwritten) :x: version ignored :x: source ignored (not overwritten) :x:
-
A formatted plain string
echo '[dd.env=rafael-testing dd.service=auth-user dd.version=42 dd.source=ruby-overwrite dd.trace_id=754516775358317257] HI DAWG' | nc localhost 10518trace reported ✅ environment reported ✅ service reported (overwritten) ✅ version reported ✅ source: ignored (not overwritten) :x:
Seeing the above results, some points:
- Is
ddsource/dd.sourcereally necessary on the log string? At least on my setup it felt a bit useless. trace_idseems to be the only correctly detected attribute inside anamed_tagsobject. Can Datadog correctly detect other attributes as well? It seems to be a bug that it doesn't.
👋 @Drowze Thanks for reporting, I will take a look at this issue.
👋 @Drowze , I want to get back to you about this issue.
I am still communicating with our internal backend team to figure out the culprit that is causing the correlation issue with semantic logger.
So far, with semantic logger format, only correlation for trace_id and span_id in the named_tags are implemented. service and source from agent configuration also seems to be implemented.
I will keep you posted with my research and make changes for tracer if necessary. 😉
@Drowze Sorry I have not made too much progress on this.
Could you file for a ticket about this issue? This would go through layers of triage from our support engineers(and to track this), would eventually find the owner of this.
https://help.datadoghq.com/hc/en-us/requests/new
You could simply referred to this Github issue, and tagged me for suggesting doing so.
Any updates around this?
We're noticing an issue in our setup that may be similar but possibly distinct.
We're in a containerized environment and the SemanticLogger.host attribute is picking up our pod name and propagating that up as the host value in our logs. (i.e. a JSON log starts with { host: 'pod-name-example' ....
This has caused us correlation headaches where we now have significant deltas in how logs from a single pod are reported because the majority of logs are application logs via rails_semantic_logger including the invalid hostname.
It looks like datadog stream-logs has the proper host information associated at the agent-level, but the log's value is overriding it. To get around this, I need a custom formatter that looks like:
module SemanticLogger
module Formatters
class HostlessJson < SemanticLogger::Formatters::Raw
def initialize(time_format: :iso_8601, time_key: :timestamp, **args)
super(time_format: time_format, time_key: time_key, **args)
end
# NOTE: `#call` is where all of the changes live. This is basically a carbon copy of SemanticLogger::Formatters::Json
# formatter but with the `h.delete(:host)` call.
def call(log, logger)
log_hash = super(log, logger).tap do |h|
h.delete :host
end.to_json
end
end
end
end
I think this could equally be seen as a Datadog or a SemanticLogger issue since SemanticLogger.host is implemented in such a way that it's always non-nil, but the Datadog instrumentation is implemented via a patch, rather than a custom appender but the impact this has on us is greater on the Datadog side, as a customer
Edit: It turns out SemanticLogger has an API for this behavior if you specify a custom formatter.
# config/environments/production.rb
log_formatter = SemanticLogger::Formatters::Json.new(
log_host: false # ✨✨✨
)
config.semantic_logger.add_appender(io: STDOUT, formatter: log_formatter)
I see that this issue was opened 2 years ago, does it mean semantic_logger integration is still broken or will we see trace_id and span_id in logs and proper correlation with default semantic_logger setup and dd c.tracing.log_injection = true ?
👋 Sorry for the delay here. I tried a very simple local app outputting SemanticLogger log lines, and I can't get the any sort of rich parsing from the Logs product:
# I get "No attributes have been extracted from the log message." for
SemanticLogger.add_appender(io: log_output)
# I don't get the log line at all for
SemanticLogger.add_appender(io: log_output, formatter: :json)
# For reference, my structured message looks like this in memory:
{"host":"my-host","application":"Semantic Logger","timestamp":"2024-11-08T19:43:49.177537Z","level":"info","level_index":2,"pid":1234,"thread":"132","named_tags":{"dd":{"env":"production","service":"MyService","version":"1.2.3","trace_id":"1039570011634864780","span_id":"1469129669426488971"},"ddsource":"ruby"},"name":"TestClass","message":"My Message!"}
I'll try to setup a basic Rails app from scratch, then add SemanticLogger, to ensure that I'm using a realistic setup, instead of manually using the SemanticLogger API which is not a common application configuration.
I use SemanticLogger at work with datadog 2.x (we're running on 2.5 in prod currently).
rails_semantic_logger + datadog gems work as expected.
In config/environments/production.rb there are a couple of additional lines to log to STDOUT via JSON and you can use config.rails_semantic_logger.format = :json in config/environments/development.rbto getlogs/development.log` printing in JSON under RAILS_ENV=development.
Trace correlation works out of the box for us with the current SemanticLogger integration BUT I wanted to post that bit about host correlation because it was a gotcha for us which caused DD logs to not correlate to their host infra properly until I dropped the host value from the payload.
I was able to refactor it to be more simple by using:
config.semantic_logger.add_appender(
io: STDOUT,
formatter: SemanticLogger::Formatters::Json.new(log_host: false)
)
Thank you, @tylerwillingham.
I'll play around with the log_host option and document the incompatibility in a clear way (maybe emitting a warning for it).