[BUG]: Datadog APM Not Propagating Traces for Sidekiq Pro Batch Callbacks

Open sundars1995 opened this issue 9 months ago • 2 comments

Tracer Version(s)

2.12.1

Ruby Version(s)

ruby 3.1.6p260 (2024-05-29 revision a777087be6) [arm64-darwin24]

Relevant Library and Version(s)

sidekiq 6.0, sidekiq pro

Bug Report

We are integrating Datadog APM tracing into our Rails application, which makes heavy use of Sidekiq Pro batch jobs. The default Sidekiq tracing instrumentation works well for most jobs, but we are encountering issues with tracing Sidekiq Pro batch callbacks (on_success, on_complete).

Observed Issue

  • When a Sidekiq batch completes, its callbacks (such as on_success and on_complete) are treated as separate traces rather than being linked to the parent trace.

  • This is especially problematic because our batch jobs have deep nesting, where callbacks can create new batches and trigger further callbacks.

  • The result is that the entire process is fragmented into multiple traces, making it difficult to correlate jobs under a single trace.

Expected Behavior

  • The callback jobs should inherit and propagate the parent batch trace, ensuring all related jobs are under a single distributed trace.

Reproduction Code

Reproduction Workflow

  1. A parent job creates a Sidekiq Pro batch job.

  2. The batch job enqueues multiple nested jobs.

  3. Upon batch completion, an **on_complete** callback is triggered.

  4. This callback schedules another batch job, which follows the same pattern.

  5. Each batch's callback job is treated as a new trace, breaking the trace propagation.

Parent Job
   ├── Batch 1
   │      ├── Nested Job A
   │      ├── Nested Job B
   │      └── (on_complete Callback → Creates Batch 2)
   │
   ├── Batch 2
   │      ├── Nested Job C
   │      ├── Nested Job D
   │      └── (on_complete Callback → Creates Batch 3)
   │
   ├── Batch 3
   │      ├── Nested Job E
   │      ├── Nested Job F
   │      └── (on_complete Callback → ...)
   └── (on_complete Callback -> )
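To make the difference between observed and expected behavior concrete, here is a minimal, library-free sketch of the chain above. It does not use Sidekiq Pro or dd-trace-rb: `TraceContext`, `FakeBatch`, `ChainCallback`, and `run_chain` are hypothetical stand-ins invented for illustration, and the `"trace_id"` and `"propagate"` option keys are assumptions, not part of either library. The only thing it mirrors is the general shape of registering a callback with an options hash and passing that hash to the callback.

```ruby
require "securerandom"

# Hypothetical stand-in for a trace context; not part of dd-trace-rb.
TraceContext = Struct.new(:trace_id)

# Hypothetical stand-in for a batch. It only stores callback registrations
# and fires them on demand; no jobs are actually enqueued.
class FakeBatch
  def initialize
    @callbacks = []
  end

  # Register a callback class together with an options hash.
  def on(event, klass, options = {})
    @callbacks << [event, klass, options]
  end

  # Simulate batch completion by invoking every registered callback.
  def complete!
    @callbacks.each do |event, klass, options|
      klass.new.public_send("on_#{event}", nil, options)
    end
  end
end

# Collects the trace id each callback ran under, in order.
SEEN_TRACE_IDS = []

class ChainCallback
  def on_complete(_status, options)
    # Continue the caller's trace if an id was handed over, else start a new one.
    trace = TraceContext.new(options["trace_id"] || SecureRandom.hex(8))
    SEEN_TRACE_IDS << trace.trace_id

    depth = options.fetch("depth", 1)
    return if depth >= 3 # stop after Batch 3, as in the diagram

    # The callback creates the next batch, optionally forwarding the trace id.
    next_options = { "depth" => depth + 1, "propagate" => options["propagate"] }
    next_options["trace_id"] = trace.trace_id if options["propagate"]
    next_batch = FakeBatch.new
    next_batch.on(:complete, ChainCallback, next_options)
    next_batch.complete!
  end
end

# Runs the three-batch chain and returns the trace ids seen by the callbacks.
def run_chain(propagate:)
  SEEN_TRACE_IDS.clear
  parent = TraceContext.new(SecureRandom.hex(8))
  options = { "depth" => 1, "propagate" => propagate }
  options["trace_id"] = parent.trace_id if propagate
  batch = FakeBatch.new
  batch.on(:complete, ChainCallback, options)
  batch.complete!
  SEEN_TRACE_IDS.dup
end

fragmented = run_chain(propagate: false) # observed: one trace per callback
linked = run_chain(propagate: true)      # expected: one shared trace
puts "without propagation: #{fragmented.uniq.length} distinct trace ids"
puts "with propagation:    #{linked.uniq.length} distinct trace id"
```

Without forwarding, each of the three callbacks starts its own trace; with forwarding, all three share the parent's id, which is the behavior we would like the integration to provide.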

Configuration Block

Datadog.configure do |c|
  c.tracing.instrument :sidekiq, distributed_tracing: true
end

Error Logs

No response

Operating System

No response

How does Datadog help you?

Our company currently uses Datadog extensively for logging, and we are instrumenting APM traces along with RUM for distributed tracing across all our applications.

sundars1995 avatar Mar 12 '25 13:03 sundars1995

Thank you so much for the report, @sundars1995!

Let me record this request and start the process of getting access to Sidekiq Pro so we can test this.

marcotc avatar Mar 27 '25 18:03 marcotc

Happy to help here. Callbacks are not technically part of the batch; they run “after” the batch, but I can see how it would be useful to group them together. We’ll investigate.

Also, Sidekiq 6 is no longer supported; I’d suggest you upgrade to 7 at your convenience.

mperham avatar Mar 28 '25 01:03 mperham