
[BUG]: current trace is missing when in a new Fiber (e.g. `Enumerator.new`)

Open agrobbin opened this issue 2 months ago • 0 comments

Tracer Version(s)

2.22.0

Ruby Version(s)

ruby 3.4.7p58

Relevant Library and Version(s)

openai

Bug Report

This library uses Thread.current for storing / accessing the current trace / span, which works in 99% of cases. However, the one place this doesn't work as expected is when fibers are used in very specific ways, such as with Enumerator.new and external iteration. Unluckily for us, OpenAI's internal request execution logic uses exactly that (source).

Because of that, we cannot rely on Thread.current to provide us with the current trace: values stored via Thread.current[] are actually "Fiber-local", not "Thread-local", and they are not inherited by "child" fibers. "Fiber storage variables" (Fiber[], available since Ruby 3.2), on the other hand, are inherited and are designed to handle an Enumerator-created fiber.

In our case, this was noticed specifically with the OpenAI Ruby SDK, but it seems to be a general issue whenever traced logic is executed within an Enumerator.new block that is consumed via external iteration.

A minimal reproduction that illustrates the issue:

Thread.current[:foo] = 123

e = Enumerator.new do |y|
  y << Thread.current[:foo]
end

e.next
# => nil
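To make the difference between the storage mechanisms concrete, here is a slightly extended sketch of the same reproduction. It assumes Ruby 3.2 or newer (for Fiber[]), and the keys :foo, :bar, and :baz are placeholders chosen purely for illustration:

```ruby
# Three ways to stash a value before handing control to an Enumerator.
Thread.current[:foo] = 123                     # fiber-local, despite the name
Thread.current.thread_variable_set(:bar, 456)  # truly thread-local
Fiber[:baz] = 789                              # fiber storage, inherited by new fibers

probe = Enumerator.new do |y|
  y << [Thread.current[:foo], Thread.current.thread_variable_get(:bar), Fiber[:baz]]
end

external = probe.next   # external iteration: the block runs in a new fiber
internal = probe.first  # internal iteration: the block runs in the calling fiber

p external # => [nil, 456, 789]
p internal # => [123, 456, 789]
```

Only the Thread.current[] value is lost, and only under external iteration, which matches the behavior described above.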

We've worked around this by patching the OpenAI request transport class to use a "Fiber storage variable":

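# Assumes the datadog gem is loaded, so that Datadog::Tracing is available.
module PooledNetPatcherPatch
  DATADOG_TRACE_DIGEST_FIBER_KEY = :_openai_datadog_trace_digest

  def execute(...)
    # Stash the active trace's digest in Fiber storage, which is inherited by
    # the fiber that Enumerator.new creates.
    Fiber[DATADOG_TRACE_DIGEST_FIBER_KEY] = Datadog::Tracing.active_trace&.to_digest

    super
  ensure
    # Clear the digest so it does not leak into unrelated work on this fiber.
    Fiber[DATADOG_TRACE_DIGEST_FIBER_KEY] = nil
  end

  private

  def with_pool(...)
    # Expected to run inside the Enumerator-created fiber: resume the trace
    # from the stashed digest.
    Datadog::Tracing.continue_trace!(Fiber[DATADOG_TRACE_DIGEST_FIBER_KEY])

    super
  end
end

OpenAI::Internal::Transport::PooledNetRequester.prepend(PooledNetPatcherPatch)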
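One caveat with resetting the key to nil in ensure is that a nested call would wipe out a value set by an outer call. A minimal sketch of a save-and-restore variant follows. It is not part of dd-trace-rb or the OpenAI SDK: the helper name with_fiber_value and the key :trace_digest are hypothetical, and plain strings stand in for real trace digests.

```ruby
# Hypothetical helper: set a Fiber storage value for the duration of a block,
# then restore whatever was there before (instead of resetting to nil).
def with_fiber_value(key, value)
  previous = Fiber[key]
  Fiber[key] = value
  yield
ensure
  Fiber[key] = previous
end

# Reads the value from inside an Enumerator-created fiber (external iteration).
read_in_enumerator = -> { Enumerator.new { |y| y << Fiber[:trace_digest] }.next }

seen = with_fiber_value(:trace_digest, "outer") do
  inner = with_fiber_value(:trace_digest, "inner") { read_in_enumerator.call }
  [inner, read_in_enumerator.call]
end

p seen                  # => ["inner", "outer"]
p Fiber[:trace_digest]  # => nil
```

The Enumerator's fiber is created on the first call to next, so it inherits whatever Fiber storage is current at that moment.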

I wasn't really sure if this was something that could be easily fixed, but it seemed worth raising to at least start a discussion and share our current solution!

Reproduction Code

No response

Configuration Block

No response

Error Logs

No response

Operating System

No response

How does Datadog help you?

No response

agrobbin · Dec 07 '25 00:12