opentelemetry-ruby

Forked Process Resource Attributes Are Missing

Open arielvalentin opened this issue 2 years ago • 4 comments

Description of the bug

Spans produced by a child process are missing process resource attributes. The process.pid attribute is always set to the parent's process ID, and process.parent_pid is missing. See the output below for an example:

Parent Resources Parent PID: 26534

{"service.name"=>"unknown_service", "process.pid"=>26534, "process.command"=>"stale-pid.rb", "process.runtime.name"=>"ruby", "process.runtime.version"=>"3.1.3", "process.runtime.description"=>"ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]", "telemetry.sdk.name"=>"opentelemetry", "telemetry.sdk.language"=>"ruby", "telemetry.sdk.version"=>"1.2.0"}

Forked Resources Child PID: 26551 Parent PID: 26534

{"service.name"=>"unknown_service", "process.pid"=>26534, "process.command"=>"stale-pid.rb", "process.runtime.name"=>"ruby", "process.runtime.version"=>"3.1.3", "process.runtime.description"=>"ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]", "telemetry.sdk.name"=>"opentelemetry", "telemetry.sdk.language"=>"ruby", "telemetry.sdk.version"=>"1.2.0"}

Share details about your runtime

Operating system details: Linux, Ubuntu 20.04 LTS
RUBY_ENGINE: "ruby"
RUBY_VERSION: "3.1.3"
RUBY_DESCRIPTION: "ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [arm64-darwin21]"

Share a simplified reproduction if possible

#!/usr/bin/env ruby
# frozen_string_literal: true

# Copyright The OpenTelemetry Authors
#
# SPDX-License-Identifier: Apache-2.0
# Export traces to console by default
ENV['OTEL_TRACES_EXPORTER'] ||= 'console'

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'
  gem 'opentelemetry-sdk'
end

OpenTelemetry::SDK.configure
at_exit do
  OpenTelemetry.tracer_provider.shutdown
end

tracer = OpenTelemetry.tracer_provider.tracer('example', '1.0')

puts "Parent Resources Parent PID: #{Process.pid}"
puts OpenTelemetry.tracer_provider.resource.attribute_enumerator.to_h

child_pid = fork do
  puts "Forked Resources Child PID: #{Process.pid} Parent PID: #{Process.ppid}"
  puts OpenTelemetry.tracer_provider.resource.attribute_enumerator.to_h
end
Process.wait(child_pid)

tracer.in_span('parent-process') do |span|
  child_process_pid = Process.fork do
    tracer.in_span('forked-process') do |forked_span|
      forked_span.add_attributes(
        'parent.process.pid' => Process.ppid,
        'forked.process.pid' => Process.pid
      )
      sleep 1
      puts "child, pid #{Process.pid} exiting..."
    end
  end

  span.add_attributes(
    'parent.process.pid' => Process.pid,
    'forked.process.pid' => child_process_pid
  )
  puts "parent, pid #{Process.pid}, waiting on child pid #{child_process_pid}"
  Process.wait(child_process_pid)
  puts 'parent exiting'
end

arielvalentin, Jan 19 '23 13:01

Sadly, the fix here is "configure OpenTelemetry after forking". Reconfiguring is hard-or-unsupported, IIRC. 🤔
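To illustrate why configuring after the fork helps, here is a minimal stdlib-only sketch. It assumes the resource behaves like a snapshot taken once at configure time. `build_resource` is a hypothetical stand-in for that step, not an SDK method.

```ruby
# Hypothetical stand-in for "configure": captures the pid once, like a
# resource that is computed a single time and then reused.
def build_resource
  { 'process.pid' => Process.pid }.freeze
end

# Built in the parent, before forking.
configured_before_fork = build_resource

reader, writer = IO.pipe
child_pid = fork do
  reader.close
  # Built again inside the child, i.e. "configure after forking".
  configured_after_fork = build_resource
  writer.puts "#{configured_before_fork['process.pid']} #{configured_after_fork['process.pid']}"
  writer.close
end
writer.close

# The child reports both values back to the parent over the pipe.
stale_pid, fresh_pid = reader.read.split.map(&:to_i)
reader.close
Process.wait(child_pid)

puts "parent pid:                  #{Process.pid}"
puts "child pid:                   #{child_pid}"
puts "snapshot inherited by child: #{stale_pid}"
puts "snapshot rebuilt in child:   #{fresh_pid}"
```

The inherited snapshot still carries the parent's pid, which matches the output in the issue description. Only the snapshot rebuilt inside the child reflects the child's pid.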

fbogsany, Jan 19 '23 17:01

Assuming you have some kind of after_fork hook available in your application framework (e.g. in Rails, Unicorn, Puma, Resque, etc.), then you can do something like:

OpenTelemetry.tracer_provider = OpenTelemetry::SDK::Trace::TracerProvider.new(
  sampler: OpenTelemetry.tracer_provider.sampler,
  id_generator: OpenTelemetry.tracer_provider.id_generator,
  span_limits: OpenTelemetry.tracer_provider.span_limits,
  resource: OpenTelemetry.tracer_provider.resource.merge(OpenTelemetry::SDK::Resources::Resource.process),
)

The caveat, though, is that only new Tracer instances will pick up the change.
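A rough sketch of that caveat, using hypothetical stand-in classes (`SketchProvider`, `SketchTracer`, `SketchAPI`) rather than the SDK itself. It assumes a tracer keeps a reference to the provider that created it, so swapping the global provider does not affect tracers handed out earlier.

```ruby
# Hypothetical stand-ins; these are NOT the OpenTelemetry SDK classes.
class SketchTracer
  def initialize(provider)
    @provider = provider # assumption: a tracer remembers its own provider
  end

  # The pid that spans from this tracer would report.
  def resource_pid
    @provider.resource['process.pid']
  end
end

class SketchProvider
  attr_reader :resource

  def initialize(resource)
    @resource = resource
  end

  def tracer
    SketchTracer.new(self)
  end
end

# Stand-in for the global accessor that holds the current provider.
module SketchAPI
  class << self
    attr_accessor :tracer_provider
  end
end

# A tracer is obtained before the provider is replaced...
SketchAPI.tracer_provider = SketchProvider.new({ 'process.pid' => 100 })
old_tracer = SketchAPI.tracer_provider.tracer

# ...then the global provider is swapped, as in an after_fork hook.
SketchAPI.tracer_provider = SketchProvider.new({ 'process.pid' => 200 })
new_tracer = SketchAPI.tracer_provider.tracer

puts "tracer created before the swap reports pid #{old_tracer.resource_pid}"
puts "tracer created after the swap reports pid #{new_tracer.resource_pid}"
```

Under that assumption, any code that cached a tracer before the fork (for example in a constant or an instance variable) would need to fetch a new one after the provider is replaced.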

fbogsany, Jan 19 '23 18:01

We're updating a couple of attributes (process.pid and yjit_resumed) after forking, including after re-forking. We added a helper to our wrapper gem:

    # Updates the global resource with the provided attributes and the current process resource.
    #
    # This is intended to be used primarily after forking to update the resource 'process.pid'
    # attribute to the new process id. It can also be used to update other attributes, such as
    # 'yjit_resumed' to indicate that the process is running with YJIT enabled.
    def update_resource(attributes = {})
      resource = OpenTelemetry
        .tracer_provider
        .resource
        .merge(OpenTelemetry::SDK::Resources::Resource.process)
        .merge(OpenTelemetry::SDK::Resources::Resource.create(attributes))
      OpenTelemetry.tracer_provider.instance_variable_set(:@resource, resource)
    end

We considered monkey-patching the SDK TracerProvider with attr_writer :resource to complement the existing attr_reader :resource, but given we funnel everything through this helper anyway, it didn't seem warranted.

In Ruby 3.1+, we can then use the Process._fork hook:

  module ForkHook
    def _fork
      ret = super
      if ret == 0 && OpenTelemetry.tracer_provider.respond_to?(:resource)
        OpenTelemetry::Shopify.update_resource
      end
      ret
    end
  end
  Process.singleton_class.prepend(ForkHook)

The respond_to?(:resource) guard ensures we don't call the helper if the global tracer_provider is not an SDK tracer provider. 🤔 Arguably, that guard should be moved into the helper.
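For illustration, here is a stdlib-only sketch of the same pattern with the guard moved into the helper. `SketchTelemetry`, `HookProvider` and `SketchForkHook` are hypothetical names standing in for the wrapper gem and the SDK provider. It assumes Ruby 3.1+ (for `Process._fork`) and a platform that supports `fork`.

```ruby
# Hypothetical stand-in for an SDK provider that exposes a resource.
HookProvider = Struct.new(:resource)

# Hypothetical stand-in for the wrapper gem's helper.
module SketchTelemetry
  class << self
    attr_accessor :provider

    # The guard lives here, so callers need not check the provider type.
    def update_resource(attributes = {})
      return unless provider.respond_to?(:resource)

      provider.resource = provider.resource
                                  .merge('process.pid' => Process.pid)
                                  .merge(attributes)
    end
  end
end

# Kernel#fork and Process.fork go through Process._fork in Ruby 3.1+.
module SketchForkHook
  def _fork
    pid = super
    SketchTelemetry.update_resource if pid.zero? # zero means "in the child"
    pid
  end
end
Process.singleton_class.prepend(SketchForkHook)

SketchTelemetry.provider = HookProvider.new({ 'process.pid' => Process.pid })

reader, writer = IO.pipe
child_pid = fork do
  reader.close
  # By now the hook has already refreshed the resource in the child.
  writer.puts SketchTelemetry.provider.resource['process.pid']
  writer.close
end
writer.close
pid_seen_in_child = reader.read.to_i
reader.close
Process.wait(child_pid)

puts "child pid: #{child_pid}, pid in child's resource: #{pid_seen_in_child}"
```

With the guard inside the helper, the hook body stays a single call, and a provider without a `resource` method is skipped silently.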

fbogsany, Feb 01 '24 15:02

Yeah, pretty much what @fbogsany said 😄 (ended up moving the guard clause to update_resource).

Relevant docs for those interested, which include some informative caveats.

plantfansam, Feb 01 '24 22:02

👋 This issue has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this issue will be closed eventually by the stale bot.

github-actions[bot], Mar 10 '24 01:03