
[BUG]: Karafka's "worker.process" trace is lost after iterating the messages (when distributed tracing is enabled)

Open · Drowze opened this issue 3 months ago · 2 comments

Tracer Version(s)

2.19.0

Ruby Version(s)

ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23]

Relevant Library and Version(s)

karafka 2.5.0

Bug Report

When using the Karafka integration with distributed_tracing enabled, the original trace is lost once we start iterating through the messages.

Reproduction Code

  1. First make sure Kafka (or a compatible platform) is running.
# I'm using Redpanda, a Kafka-compatible platform
$ docker run --rm -p 9092:9092 docker.redpanda.com/redpandadata/redpanda
  2. Then, on a brand new Rails app:
# add karafka and datadog gems
$ bundle add karafka datadog
# config/initializers/datadog.rb
Datadog.configure do |c|
  c.tracing.instrument :karafka, distributed_tracing: true, enabled: true
end

# karafka.rb
class KarafkaApp < Karafka::App
  setup do |config|
    config.kafka = { 'bootstrap.servers': '127.0.0.1:9092' }
  end

  routes.draw do
    topic :example do
      consumer ExampleConsumer
    end
  end
end

# app/consumers/example_consumer.rb
class ExampleConsumer < Karafka::BaseConsumer
  def consume
    log("before #each")

    messages.each do |message|
      log("inside #each")
      puts "doing some hard work!"
    end

    log("after #each")
  end

  def log(description)
    trace = Datadog::Tracing.active_trace
    span = Datadog::Tracing.active_span

    data = {id: trace&.id, name: trace&.name, resource: trace&.resource, parent_span_id: trace&.parent_span_id}
    puts "[consumer - #{description}] trace: #{data}"

    data = {id: span&.id, name: span&.name, resource: span&.resource, parent_id: span&.parent_id}
    puts "[consumer - #{description}] span: #{data}"
  end
end

# producing messages in a terminal
["msg1", "msg2"].each do |msg|
  Datadog::Tracing.trace("producer.#{msg}") do
    digest = Datadog::Tracing.active_trace.to_digest
    headers = {}
    Datadog::Tracing::Contrib::Karafka.inject(digest, headers)
    Karafka.producer.produce_async({
      topic: "example", payload: { foo: msg }.to_json, headers: headers
    })
  end
end

Result of the above code (with some colors to help with visualization). Note that the "consumer - after #each" lines don't have an active trace/span.

[screenshot]
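
For context, the only workaround I can think of in the meantime looks roughly like the sketch below. It is untested: the consumer.after_each span name is just an example, and using Datadog::Tracing.continue_trace! to re-attach the captured digest is my assumption about the API rather than something I've confirmed against the integration.

# workaround sketch (untested)
class ExampleConsumer < Karafka::BaseConsumer
  def consume
    # capture the "worker.process" trace context before #each loses it
    digest = Datadog::Tracing.active_trace&.to_digest

    messages.each do |message|
      puts "doing some hard work!"
    end

    return if digest.nil?

    # assumption: continue_trace! re-attaches the captured context, so work
    # done after #each is still recorded under the original trace
    Datadog::Tracing.continue_trace!(digest) do
      Datadog::Tracing.trace("consumer.after_each") do
        puts "follow-up work, hopefully attributed to the original trace"
      end
    end
  end
end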

Error Logs

No response

Operating System

No response

How does Datadog help you?

No response

Drowze · Aug 29 '25 01:08

Hi @Drowze, I am attempting to run your reproducer, but it's missing some requires: Karafka and ApplicationConsumer are undefined, for example.

p-datadog · Sep 16 '25 16:09

Hey @p-datadog 👋

Apologies - I'd left ApplicationConsumer in there, but you can replace it with Karafka::BaseConsumer. Also, I hadn't really mentioned it, but you should:

  • make sure Apache Kafka (or a compatible platform) is running and listening on port 9092
  • add the karafka gem and spawn the Karafka server in one terminal before producing the messages in another one (rough commands sketched below)
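
To make the second point concrete, this is roughly the shape of it (the script path is just an example, and the exact producer bootstrapping is in the repro commit linked below):

# terminal 1: start the Karafka consumer process
$ bundle exec karafka server

# terminal 2: run the producer snippet from the issue, saved e.g. as script/produce.rb
$ bundle exec rails runner script/produce.rb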

(I've just updated the issue to reflect the information above)

I've now set up a small repro; here's the relevant commit (any code before this commit was generated by rails new): https://github.com/Drowze/rails_3.4.1__8.0.2.1_datadog_4873/commit/664394120759388e9ca4754359416966b85e380b

Running it locally: [screenshot]

Drowze · Sep 16 '25 18:09