opentelemetry-ruby
Forked Process Resource Attributes Are Missing
Description of the bug
Spans produced by a child process are missing process resource attributes^1. The process.pid attribute is always set to the parent process ID, and process.parent_pid is missing. See the output below for an example:
Parent Resources Parent PID: 26534
{"service.name"=>"unknown_service", "process.pid"=>26534, "process.command"=>"stale-pid.rb", "process.runtime.name"=>"ruby", "process.runtime.version"=>"3.1.3", "process.runtime.description"=>"ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]", "telemetry.sdk.name"=>"opentelemetry", "telemetry.sdk.language"=>"ruby", "telemetry.sdk.version"=>"1.2.0"}
Forked Resources Child PID: 26551 Parent PID: 26534
{"service.name"=>"unknown_service", "process.pid"=>26534, "process.command"=>"stale-pid.rb", "process.runtime.name"=>"ruby", "process.runtime.version"=>"3.1.3", "process.runtime.description"=>"ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]", "telemetry.sdk.name"=>"opentelemetry", "telemetry.sdk.language"=>"ruby", "telemetry.sdk.version"=>"1.2.0"}
Share details about your runtime
Operating system details: Linux, Ubuntu 20.04 LTS
RUBY_ENGINE: "ruby"
RUBY_VERSION: "3.1.3"
RUBY_DESCRIPTION: "ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]"
Share a simplified reproduction if possible
#!/usr/bin/env ruby
# frozen_string_literal: true

# Copyright The OpenTelemetry Authors
#
# SPDX-License-Identifier: Apache-2.0

ENV['OTEL_TRACES_EXPORTER'] ||= 'console'

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'
  gem 'opentelemetry-sdk'
end

# Export traces to console by default
OpenTelemetry::SDK.configure

at_exit do
  OpenTelemetry.tracer_provider.shutdown
end

tracer = OpenTelemetry.tracer_provider.tracer('example', '1.0')

puts "Parent Resources Parent PID: #{Process.pid}"
puts OpenTelemetry.tracer_provider.resource.attribute_enumerator.to_h

child_pid = fork do
  puts "Forked Resources Child PID: #{Process.pid} Parent PID: #{Process.ppid}"
  puts OpenTelemetry.tracer_provider.resource.attribute_enumerator.to_h
end
Process.wait(child_pid)

tracer.in_span('parent-process') do |span|
  child_process_pid = Process.fork do
    tracer.in_span('forked-process') do |forked_span|
      forked_span.add_attributes(
        'parent.process.pid' => Process.ppid,
        'forked.process.pid' => Process.pid
      )
      sleep 1
      puts "child, pid #{Process.pid} exiting..."
    end
  end

  span.add_attributes(
    'parent.process.pid' => Process.pid,
    'forked.process.pid' => child_process_pid
  )
  puts "parent, pid #{Process.pid}, waiting on child pid #{child_process_pid}"
  Process.wait(child_process_pid)
  puts 'parent exiting'
end
Sadly, the fix here is "configure OpenTelemetry after forking". Reconfiguring is hard-or-unsupported, IIRC. 🤔
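The reason configuring after the fork helps: the process resource is a plain value computed once at configure time, and fork copies it into the child unchanged. A minimal stdlib-only sketch of that mechanism, where `captured` is a hypothetical stand-in for the SDK's resource and not its real implementation:

```ruby
require 'json'

# Forks once and reports what the child sees. `captured` is a hypothetical
# stand-in for a resource that was built once, before the fork.
def stale_pid_report
  captured = { 'process.pid' => Process.pid }.freeze
  reader, writer = IO.pipe

  child_pid = fork do
    reader.close
    # The child gets a copy of the parent's memory, so `captured` still holds
    # the parent's PID even though Process.pid now returns the child's.
    payload = {
      'captured_pid' => captured['process.pid'],
      'actual_pid' => Process.pid
    }
    writer.puts(JSON.generate(payload))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  report = JSON.parse(reader.read)
  reader.close
  Process.wait(child_pid)
  report.merge('child_pid' => child_pid)
end

report = stale_pid_report
puts "captured: #{report['captured_pid']}, actual: #{report['actual_pid']}"
```

The two numbers printed differ: the captured value is the parent's PID, the actual value is the child's.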
Assuming you have some kind of after_fork hook available in your application framework (e.g. in Rails, Unicorn, Puma, Resque, etc.), then you can do something like:
OpenTelemetry.tracer_provider = OpenTelemetry::SDK::Trace::TracerProvider.new(
  sampler: OpenTelemetry.tracer_provider.sampler,
  id_generator: OpenTelemetry.tracer_provider.id_generator,
  span_limits: OpenTelemetry.tracer_provider.span_limits,
  resource: OpenTelemetry.tracer_provider.resource.merge(OpenTelemetry::SDK::Resources::Resource.process),
)
The caveat, though, is that only new Tracer instances will pick up the change.
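That caveat follows from each tracer holding on to the provider that created it. A toy model of the behaviour, using hypothetical `ToyProvider`, `ToyTracer`, and `ToyGlobal` names rather than the SDK's own classes:

```ruby
# Toy stand-ins for illustration only; these are not the SDK classes.
class ToyTracer
  def initialize(provider)
    @provider = provider # reference captured when the tracer is created
  end

  def resource_pid
    @provider.resource['process.pid']
  end
end

class ToyProvider
  attr_reader :resource

  def initialize(resource)
    @resource = resource
  end

  def tracer
    ToyTracer.new(self)
  end
end

# Plays the role of the global provider accessor.
module ToyGlobal
  class << self
    attr_accessor :tracer_provider
  end
end

ToyGlobal.tracer_provider = ToyProvider.new({ 'process.pid' => 100 })
old_tracer = ToyGlobal.tracer_provider.tracer

# Simulate the after-fork swap: a fresh provider with an updated resource.
ToyGlobal.tracer_provider = ToyProvider.new({ 'process.pid' => 200 })
new_tracer = ToyGlobal.tracer_provider.tracer

puts old_tracer.resource_pid # 100: still bound to the replaced provider
puts new_tracer.resource_pid # 200: created from the new provider
```

To the extent the SDK matches this model, any tracer memoized before the fork (in a constant or an instance variable, say) would need to be re-acquired afterwards.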
We're updating a couple of attributes (process.pid and yjit_resumed) after forking, including after re-forking. We added a helper to our wrapper gem:
# Updates the global resource with the provided attributes and the current process resource.
#
# This is intended to be used primarily after forking to update the resource 'process.pid'
# attribute to the new process id. It can also be used to update other attributes, such as
# 'yjit_resumed' to indicate that the process is running with YJIT enabled.
def update_resource(attributes = {})
  resource = OpenTelemetry
    .tracer_provider
    .resource
    .merge(OpenTelemetry::SDK::Resources::Resource.process)
    .merge(OpenTelemetry::SDK::Resources::Resource.create(attributes))
  OpenTelemetry.tracer_provider.instance_variable_set(:@resource, resource)
end
We considered monkey-patching the SDK TracerProvider with attr_writer :resource to complement the existing attr_reader :resource, but given we funnel everything through this helper anyway, it didn't seem warranted.
In Ruby 3.1+, we can then use the Process._fork hook:
module ForkHook
  def _fork
    ret = super
    if ret == 0 && OpenTelemetry.tracer_provider.respond_to?(:resource)
      OpenTelemetry::Shopify.update_resource
    end
    ret
  end
end

Process.singleton_class.prepend(ForkHook)
The respond_to?(:resource) guard ensures we don't call the helper if the global tracer_provider is not an SDK tracer provider. 🤔 Arguably, that guard should be moved into the helper.
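To watch the Process._fork hook fire without the SDK in the picture, here is a stdlib-only sketch. `AfterFork`, its `callbacks` list, and `fork_and_report` are hypothetical names for illustration; the only real API it relies on is `Process._fork` (Ruby 3.1+), which `fork` and `Process.fork` call internally.

```ruby
# Hypothetical helper: runs registered callbacks in the child after a fork.
module AfterFork
  def self.callbacks
    @callbacks ||= []
  end

  module Hook
    def _fork
      pid = super
      # _fork returns 0 in the child and the child's PID in the parent.
      AfterFork.callbacks.each(&:call) if pid.zero?
      pid
    end
  end
end

Process.singleton_class.prepend(AfterFork::Hook)

# Registers a callback, forks, and returns what each side observed.
def fork_and_report
  hook_pid = nil
  callback = -> { hook_pid = Process.pid }
  AfterFork.callbacks << callback
  reader, writer = IO.pipe

  child_pid = fork do
    reader.close
    # The callback has already run in the child by the time this block starts.
    writer.puts(hook_pid.inspect)
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  seen_in_child = reader.read.strip
  reader.close
  Process.wait(child_pid)
  AfterFork.callbacks.delete(callback)

  { child_pid: child_pid, hook_pid_in_child: seen_in_child, hook_pid_in_parent: hook_pid }
end

result = fork_and_report
puts "child #{result[:child_pid]}: hook saw PID #{result[:hook_pid_in_child]}"
puts "parent: hook saw #{result[:hook_pid_in_parent].inspect}"
```

On Ruby 3.1 or later the callback runs only in the child, so the parent's copy of `hook_pid` stays `nil`. On older Rubies `_fork` is never called and the callback does not run at all.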
Yeah, pretty much what @fbogsany said 😄 (ended up moving the guard clause to update_resource).
Relevant docs for those interested, which include some informative caveats.
👋 This issue has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this issue will be closed eventually by the stale bot.