opentelemetry-ruby icon indicating copy to clipboard operation
opentelemetry-ruby copied to clipboard

Encoding Errors when exporting ASCII-8BIT encoded Span Attributes

Open arielvalentin opened this issue 2 months ago • 2 comments

Should the OTel SDK force users to encode strings as UTF-8 before using them in user defined properties like the Names or Attributes<String> or should force encode strings as UTF-8?

A little background

The Encoding::UndefinedConversionError exception is raised in the OTLP Exporter when creating a Opentelemetry::Proto::Common::V1::AnyValue#string_value and the string values are encoded as ASCII-8BIT.

I am still working on tracking this down but based on the error message we see coming from the error handler, we know it is likely because either the web server or rack is reading the HTTP headers with ASCII-8BIT encoding:

Encoding::UndefinedConversionError "\xE5" from ASCII-8BIT to UTF-8

encoding error for key http.user_agent and value WordPress/13.1.4(by���������and���������99���88); https://<redacted>

Reproducing the Problem

One of the characters in the header is , so I have put together a short demonstration of the problem:

irb(main):104> st = "张"
=> "张"
irb(main):105> st.encoding
=> #<Encoding:UTF-8>
irb(main):106> st.force_encoding(Encoding::ASCII_8BIT)
=> "\xE5\xBC\xA0"
irb(main):107> av = Opentelemetry::Proto::Common::V1::AnyValue.new
=> <Opentelemetry::Proto::Common::V1::AnyValue: string_value: "", bool_value: false, int_value: 0, double_value: 0.0, array_value: nil, kvlist_value: nil, bytes_value: "">
irb(main):108> av.string_value = st
(irb):108:in `method_missing': "\xE5" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
        from (irb):108:in `<main>'
        from <internal:kernel>:187:in `loop'
        from script/console:82:in `<main>'

If the user supplies a string that is encoded as UTF-8 then it works as expected:

irb(main):109> st = "张"
=> "张"
irb(main):110> av = Opentelemetry::Proto::Common::V1::AnyValue.new
=> <Opentelemetry::Proto::Common::V1::AnyValue: string_value: "", bool_value: false, int_value: 0, double_value: 0.0, array_value: nil, kvlist_value: nil, bytes_value: "">
irb(main):111> av.string_value = st
=> "张"

With Forced Encoding to UTF-8 it works:

irb(main):109> st = "张"
=> "张"
irb(main):110> av = Opentelemetry::Proto::Common::V1::AnyValue.new
=> <Opentelemetry::Proto::Common::V1::AnyValue: string_value: "", bool_value: false, int_value: 0, double_value: 0.0, array_value: nil, kvlist_value: nil, bytes_value: "">
irb(main):111> av.string_value = st
=> "张"
irb(main):112> st.force_encoding(Encoding::ASCII_8BIT)
=> "\xE5\xBC\xA0"
irb(main):113> av = Opentelemetry::Proto::Common::V1::AnyValue.new
=> <Opentelemetry::Proto::Common::V1::AnyValue: string_value: "", bool_value: false, int_value: 0, double_value: 0.0, array_value: nil, kvlist_value: nil, bytes_value: "">
irb(main):114> av.string_value = st.force_encoding(Encoding::UTF_8)
=> "张"

What does DD Trace do?

It force encodes keys and values to UTF-8 if necessary:

https://github.com/DataDog/dd-trace-rb/blob/4b1901101c0f4d1033558bd9c0ff2e1c2b963bc7/lib/datadog/tracing/metadata/tagging.rb#L52 https://github.com/DataDog/dd-trace-rb/blob/4b1901101c0f4d1033558bd9c0ff2e1c2b963bc7/lib/datadog/core/utils.rb#L47

The full backtrace


/build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:380:in `method_missing': "\xE5" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:380:in `as_otlp_any_value'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:369:in `as_otlp_key_value'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:319:in `block in as_otlp_span'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:319:in `each'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:319:in `map'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:319:in `as_otlp_span'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:292:in `block (3 levels) in encode'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:292:in `map'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:292:in `block (2 levels) in encode'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:286:in `each'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:286:in `map'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:286:in `block in encode'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:279:in `each'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:279:in `map'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:279:in `encode'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-exporter-otlp-0.27.0/lib/opentelemetry/exporter/otlp/exporter.rb:80:in `export'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-sdk-1.4.1/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:188:in `block in export_batch'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-sdk-1.4.1/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:188:in `synchronize'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-sdk-1.4.1/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:188:in `export_batch'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-sdk-1.4.1/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:170:in `block in work'
	from <internal:kernel>:187:in `loop'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-sdk-1.4.1/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:159:in `work'
	from /build/vendor/gems/3.3.3/ruby/3.3.0/gems/opentelemetry-sdk-1.4.1/lib/opentelemetry/sdk/trace/export/batch_span_processor.rb:180:in `block in reset_on_fork'

arielvalentin avatar Jun 21 '24 23:06 arielvalentin