dd-trace-rb
Profiling native extension does not detect libdatadog upgrades/downgrades
This is a bit of a corner case that occurred to me the other day, and I want to track it so that we don't forget about it.
TL;DR: The workaround for this is to reinstall ddtrace after changing libdatadog versions.
Current behaviour:
Because ddtrace compiles and links against libdatadog at installation time, it becomes "bound" to the libdatadog that was available at that time, and does not respect any changes that are made after that.
Consider this gems.rb file:
source 'https://rubygems.org'
gem 'google-protobuf'
gem 'ddtrace'
gem 'libdatadog', '= 0.7.0.1.0'
and a Ruby installation that has no libdatadog or ddtrace version installed:
root@c7311b47f69d:/app/libdatadog-detect-missing# gem uninstall libdatadog ddtrace
Gem 'libdatadog' is not installed
Gem 'ddtrace' is not installed
Now let's run bundle install:
root@c7311b47f69d:/app/libdatadog-detect-missing# bundle install
Fetching gem metadata from https://rubygems.org/...
Resolving dependencies...
Using bundler 2.3.6
Using debase-ruby_core_source 0.10.16
Using ffi 1.15.5
Using msgpack 1.5.6
Using google-protobuf 3.21.5 (x86_64-linux)
Using libddwaf 1.3.0.2.0 (x86_64-linux)
Fetching libdatadog 0.7.0.1.0 (x86_64-linux)
Installing libdatadog 0.7.0.1.0 (x86_64-linux)
Fetching ddtrace 1.3.0
Installing ddtrace 1.3.0 with native extensions
Bundle complete! 3 Gemfile dependencies, 8 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.
At this point, ddtrace gets installed and links to libdatadog 0.7.0.1.0.
root@c7311b47f69d:/app/libdatadog-detect-missing# ldd /usr/local/bundle/gems/ddtrace-1.3.0/ext/ddtrace_profiling_native_extension/ddtrace_profiling_native_extension.2.7.3_x86_64-linux.so
linux-vdso.so.1 (0x00007fff51b8e000)
libruby.so.2.7 => /usr/local/lib/libruby.so.2.7 (0x00007f1942476000)
libddprof_ffi.so => /usr/local/bundle/gems/libdatadog-0.7.0.1.0-x86_64-linux/vendor/libdatadog-0.7.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../lib/libddprof_ffi.so (0x00007f19421c5000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1942038000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1941e77000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f1941c59000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1941c36000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f1941c2c000)
libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f1941ba9000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1941ba4000)
libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f1941b6a000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1942833000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1941b50000)
But let's say that, as a regular Ruby user, you pick a different version of libdatadog and run bundle install:
source 'https://rubygems.org'
gem 'google-protobuf'
gem 'ddtrace'
gem 'libdatadog', '= 0.7.0.1.1' # this was changed from 0.7.0.1.0 to 0.7.0.1.1
root@c7311b47f69d:/app/libdatadog-detect-missing# bundle install
Fetching gem metadata from https://rubygems.org/..
Resolving dependencies...
Using bundler 2.3.6
Using msgpack 1.5.6
Using ffi 1.15.5
Using google-protobuf 3.21.5 (x86_64-linux)
Using debase-ruby_core_source 0.10.16
Using libddwaf 1.3.0.2.0 (x86_64-linux)
Fetching libdatadog 0.7.0.1.1 (x86_64-linux) (was 0.7.0.1.0)
Installing libdatadog 0.7.0.1.1 (x86_64-linux) (was 0.7.0.1.0)
Using ddtrace 1.3.0
Bundle complete! 3 Gemfile dependencies, 8 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.
Now asking for the version of libdatadog on the system will state that you're supposedly using 0.7.0.1.1:
root@c7311b47f69d:/app/libdatadog-detect-missing# bundle exec ruby -e "require 'libdatadog'; puts Libdatadog::VERSION"
0.7.0.1.1
but actually ddtrace is not using that version:
root@c7311b47f69d:/app/libdatadog-detect-missing# ldd /usr/local/bundle/gems/ddtrace-1.3.0/ext/ddtrace_profiling_native_extension/ddtrace_profiling_native_extension.2.7.3_x86_64-linux.so
linux-vdso.so.1 (0x00007ffc33deb000)
libruby.so.2.7 => /usr/local/lib/libruby.so.2.7 (0x00007f034211c000)
libddprof_ffi.so => /usr/local/bundle/gems/libdatadog-0.7.0.1.0-x86_64-linux/vendor/libdatadog-0.7.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../lib/libddprof_ffi.so (0x00007f0341e6b000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0341cde000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0341b1d000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f03418ff000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f03418dc000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f03418d2000)
libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f034184f000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f034184a000)
libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f0341810000)
/lib64/ld-linux-x86-64.so.2 (0x00007f03424d9000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f03417f6000)
So you think you've upgraded/downgraded when in fact you haven't.
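The manual comparison above can be sketched in Ruby. This is only an illustration, not something ddtrace ships: `linked_libdatadog_version` is a hypothetical helper, and it assumes the library path printed by ldd contains a "libdatadog-<gem version>-<platform>" folder, as it does in the output above.

```ruby
# Hypothetical helper (not part of ddtrace): extract the libdatadog gem
# version that a compiled extension is linked against from `ldd` output.
# Assumes the path contains a "gems/libdatadog-<gem version>-<platform>" folder.
def linked_libdatadog_version(ldd_output)
  ldd_output.each_line do |line|
    # Only the line for the libdatadog shared library is relevant
    next unless line.include?('libddprof_ffi.so')

    match = line.match(%r{/gems/libdatadog-(\d+(?:\.\d+)*)})
    return match[1] if match
  end
  nil # not linked, not found, or the path has an unexpected layout
end

# One line copied from the ldd output above
sample = 'libddprof_ffi.so => /usr/local/bundle/gems/libdatadog-0.7.0.1.0-x86_64-linux/vendor/libdatadog-0.7.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../lib/libddprof_ffi.so (0x00007f0341e6b000)'

linked = linked_libdatadog_version(sample)
resolved = '0.7.0.1.1' # what Libdatadog::VERSION reported above

puts "linked=#{linked} resolved=#{resolved} mismatch=#{linked != resolved}"
```

In a real check, the string passed in would be the captured output of ldd for the extension's .so file, and `resolved` would come from Libdatadog::VERSION.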
Further proof happens if you actually remove the old version:
root@c7311b47f69d:/app/libdatadog-detect-missing# gem uninstall libdatadog
Select gem to uninstall:
1. libdatadog-0.7.0.1.0-x86_64-linux
2. libdatadog-0.7.0.1.1-x86_64-linux
3. All versions
> 1
Successfully uninstalled libdatadog-0.7.0.1.0-x86_64-linux
root@c7311b47f69d:/app/libdatadog-detect-missing# DD_PROFILING_ENABLED=true bundle exec ddtracerb exec ruby -e "require 'libdatadog'; puts Libdatadog::VERSION"
W, [2022-08-25T09:27:43.103258 #337] WARN -- ddtrace: [ddtrace] Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load ddtrace_profiling_native_extension.2.7.3_x86_64-linux due to libddprof_ffi.so: cannot open shared object file: No such file or directory' at '/usr/local/bundle/gems/ddtrace-1.3.0/lib/datadog/profiling/load_native_extension.rb:22:in `<top (required)>''
0.7.0.1.1
Bundler is still able to resolve a valid version BUT the profiler is broken since it's not actually using that version.
Since this only happens with libdatadog point releases, and we don't do those often, I doubt anyone's been bitten by this, but it's definitely a sharp edge that we should address.
Expected behaviour:
Ideally, the profiling native extension should automatically pick up and work with updated versions of libdatadog.
At minimum, it should a) detect that there's a mismatched libdatadog version; and b) provide a good error message stating what happened and how to fix it.
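A minimal sketch of what (a) and (b) could look like, assuming the version of libdatadog used at compile time were recorded somewhere the extension can read it back. `libdatadog_mismatch_message` and both of its arguments are hypothetical; nothing here is an existing ddtrace API.

```ruby
# Hypothetical check (not a real ddtrace API): compare the libdatadog version
# recorded when the extension was compiled with the version loaded at runtime.
# Both arguments are plain version strings.
def libdatadog_mismatch_message(built_against, loaded)
  return nil if built_against == loaded

  "The profiling native extension was compiled against libdatadog #{built_against}, " \
    "but libdatadog #{loaded} is the version currently loaded. " \
    'Reinstall ddtrace (for example with `gem pristine ddtrace`) to rebuild it.'
end

message = libdatadog_mismatch_message('0.7.0.1.0', '0.7.0.1.1')
warn(message) if message
```

Returning nil when the versions match keeps the happy path silent, so the caller only logs when there is something actionable to say.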
Steps to reproduce
(See above)
Addendum: I suspect this issue can also be triggered when changing the platform on libdatadog, see https://github.com/DataDog/dd-trace-rb/issues/2652#issuecomment-1450539119 for details.
The workaround for this is to reinstall ddtrace
IIRC gem pristine ddtrace should do it.
I hit this issue when I upgraded the Datadog agent on VMs. I'm confused, since I thought libdatadog would bundle the library rather than rely on the system lib.
I believe this is a bug in libdatadog, not in ddtrace.
Sorry to hear you're affected @chulkilee . Do you by any chance still have access to the logs and can share the error message you got?
I've been thinking that in some cases we may be able to provide a much better error message, but I wanted to doublecheck it would cover your case.
W, [2023-07-04T08:35:31.895350 #29107] WARN -- ddtrace: [ddtrace] (/app/shared/bundle/ruby/2.7.0/gems/ddtrace-1.12.1/lib/datadog/core/configuration/components.rb:103:in `startup!') Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'LoadError cannot load such file -- ddtrace_profiling_loader.2.7.8_x86_64-linux' at '/app/shared/bundle/ruby/2.7.0/gems/bootsnap-1.7.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:23:in `require''
Wait, maybe it's caused by the Ruby upgrade (2.7.7 to 2.7.8). I just noticed that the path in the error message includes "2.7.0", which is not specific to the Ruby patch version (2.7.7 or 2.7.8). Maybe bundle install after the Ruby upgrade didn't trigger a reinstall of libdatadog here.
Interesting... I think this one may not involve libdatadog at all (although the error message is similar and the fix is similar as well).
Looking at the log message, I suspect what happened was that when upgrading from 2.7.7 to 2.7.8, ddtrace itself was not reinstalled (and thus not recompiled).
At installation time, ddtrace includes the full Ruby version in the compiled parts of the profiler -- e.g. "ddtrace_profiling_loader.2.7.7_x86_64-linux". Thus, if the same installation gets reused on a different version, then it won't work, because it tries to load a different version.
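To make the naming concrete, here is a rough sketch. It assumes the compiled files are named "<base>.<ruby version>_<platform>", a pattern inferred from the log messages in this thread; `versioned_extension_name` and `stale_extensions` are hypothetical helpers, not ddtrace code.

```ruby
# Assumption: compiled profiler files are named "<base>.<ruby version>_<platform>",
# a pattern inferred from the log messages in this thread.
def versioned_extension_name(base, ruby_version: RUBY_VERSION, platform: RUBY_PLATFORM)
  "#{base}.#{ruby_version}_#{platform}"
end

# Returns the names in `installed` that were built for a different Ruby
# version or platform than the one requested.
def stale_extensions(installed, base, ruby_version: RUBY_VERSION, platform: RUBY_PLATFORM)
  expected = versioned_extension_name(base, ruby_version: ruby_version, platform: platform)
  installed.select { |name| name.start_with?("#{base}.") && name != expected }
end

# Built under Ruby 2.7.7, then reused under Ruby 2.7.8
installed = ['ddtrace_profiling_loader.2.7.7_x86_64-linux']
stale = stale_extensions(installed, 'ddtrace_profiling_loader',
                         ruby_version: '2.7.8', platform: 'x86_64-linux')

puts "Built for another Ruby, reinstall ddtrace: #{stale.join(', ')}" unless stale.empty?
```

Under 2.7.8 the require looks for "ddtrace_profiling_loader.2.7.8_x86_64-linux", which was never built, hence the LoadError.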
This is on purpose (to avoid mismatches between the profiler and the Ruby version), but yeah, I can definitely see how the error message is super opaque and none of the details I shared above are obvious.
And the fix is indeed the same -- reinstall ddtrace.
I'll make a note to detect these errors and provide a better message.
Thanks for sharing the log message btw, it helps a lot!
PR to improve the log message: https://github.com/DataDog/dd-trace-rb/pull/2957