dd-trace-dotnet
Bundled Tracer fails to load with .NET 8 and COMPlus_EnableDiagnostics=0
Describe the bug
While upgrading our applications to .NET 8, we noticed missing traces and APM metrics. We then found that the bundled tracer fails to load into (attach to) the .NET process when running on .NET 8 with the environment variable COMPlus_EnableDiagnostics=0 set.
To Reproduce
Steps to reproduce the behavior:
- Create a new dotnet webapp using .NET 8 with reference to Datadog.Trace.Bundle
- Set env variables for Tracer as well as COMPlus_EnableDiagnostics=0
- Run application
- Tracer is not loaded - Traces are missing!
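The environment for the steps above might look roughly like the following sketch. The tracer variable values are the ones reported in the diagnostic output further down in this issue; the application name in the last comment is a hypothetical placeholder, not taken from the repro repo.

```shell
#!/bin/sh
# Tracer variables, using the values reported by the diagnostic output in this issue.
export CORECLR_ENABLE_PROFILING=1
export CORECLR_PROFILER='{846F5F1C-F9AE-4B07-969E-05C26BC060D8}'
export CORECLR_PROFILER_PATH=/app/datadog/linux-x64/Datadog.Trace.ClrProfiler.Native.so
export DD_DOTNET_TRACER_HOME=/app/datadog

# The switch that makes the tracer fail to load on .NET 8.
export COMPlus_EnableDiagnostics=0

# Then start the application, for example (hypothetical name):
# dotnet MyWebApp.dll
```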
Have a look at my minimal reproducible repo
As you can see in this run, the bundled tracer loads successfully with .NET 7 regardless of whether COMPlus_EnableDiagnostics=0 is set. But with .NET 8 it fails to load when COMPlus_EnableDiagnostics=0 is set:
[WARNING]: The native loader library is not loaded into the process
Note: the lines "[FAILURE]: Error connecting to Agent" can be ignored - That is expected because I'm not running an agent :-)
Expected behavior
The tracer works with COMPlus_EnableDiagnostics=0
~~Screenshots~~ Visualization
Bundled Tracer works? | .NET 7 | .NET 8 |
---|---|---|
COMPlus_EnableDiagnostics=1 | ✅ | ✅ |
COMPlus_EnableDiagnostics=0 | ✅ | ❌ |
Runtime environment (please complete the following information):
- Instrumentation mode: automatic with Datadog.Trace.Bundle
- Tracer version: 2.44.0
- OS: Debian GNU/Linux 12 (bookworm)
- CLR: .NET 8.0.0
Additional context
I only tested this in a containerized environment, but I assume that other environments are also affected.
I just realized that the linked logs are apparently private, so here is what I get:
✅ .NET 7 COMPlus_EnableDiagnostics=0
Running checks on process 1
Process name: dotnet
---- STARTING TRACER SETUP CHECKS -----
Target process is running with .NET Core
1. Checking Modules Needed so the Tracer Loads:
[SUCCESS]: The tracer version 2.44.0.0 is loaded into the process.
2. Checking DD_DOTNET_TRACER_HOME and related configuration value:
[SUCCESS]: DD_DOTNET_TRACER_HOME is set to '/app/datadog' and the directory was
found correctly.
3. Checking CORECLR_PROFILER_PATH and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER_PATH is set to the correct
value of /app/datadog/linux-x64/Datadog.Trace.ClrProfiler.Native.so.
4. Checking CORECLR_PROFILER and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER is set to the correct
value of {846F5F1C-F9AE-4B07-969E-05C26BC060D8}.
5. Checking CORECLR_ENABLE_PROFILING and related configuration value:
[SUCCESS]: The environment variable CORECLR_ENABLE_PROFILING is set to the
correct value of 1.
---- CONFIGURATION CHECKS -----
1. Checking if tracing is disabled using DD_TRACE_ENABLED.
[INFO]: DD_TRACE_ENABLED is not set, the default value is true.
2. Checking if profiling is enabled using DD_PROFILING_ENABLED.
[INFO]: DD_PROFILING_ENABLED is not set, the continuous profiler is disabled.
---- DATADOG AGENT CHECKS -----
Detected agent url: http://127.0.0.1:8126/. Note: this url may be incorrect if
you configured the application through a configuration file.
Connecting to Agent at endpoint http://127.0.0.1:8126/ using HTTP
[FAILURE]: Error connecting to Agent at http://127.0.0.1:8126/: Connection
refused (127.0.0.1:8126)
✅ .NET 7 COMPlus_EnableDiagnostics unset
Running checks on process 1
Process name: dotnet
---- STARTING TRACER SETUP CHECKS -----
Target process is running with .NET Core
1. Checking Modules Needed so the Tracer Loads:
[SUCCESS]: The tracer version 2.44.0.0 is loaded into the process.
2. Checking DD_DOTNET_TRACER_HOME and related configuration value:
[SUCCESS]: DD_DOTNET_TRACER_HOME is set to '/app/datadog' and the directory was
found correctly.
3. Checking CORECLR_PROFILER_PATH and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER_PATH is set to the correct
value of /app/datadog/linux-x64/Datadog.Trace.ClrProfiler.Native.so.
4. Checking CORECLR_PROFILER and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER is set to the correct
value of {846F5F1C-F9AE-4B07-969E-05C26BC060D8}.
5. Checking CORECLR_ENABLE_PROFILING and related configuration value:
[SUCCESS]: The environment variable CORECLR_ENABLE_PROFILING is set to the
correct value of 1.
---- CONFIGURATION CHECKS -----
1. Checking if tracing is disabled using DD_TRACE_ENABLED.
[INFO]: DD_TRACE_ENABLED is not set, the default value is true.
2. Checking if profiling is enabled using DD_PROFILING_ENABLED.
[INFO]: DD_PROFILING_ENABLED is not set, the continuous profiler is disabled.
---- DATADOG AGENT CHECKS -----
Detected agent url: http://127.0.0.1:8126/. Note: this url may be incorrect if
you configured the application through a configuration file.
Connecting to Agent at endpoint http://127.0.0.1:8126/ using HTTP
[FAILURE]: Error connecting to Agent at http://127.0.0.1:8126/: Connection
refused (127.0.0.1:8126)
❌ .NET 8 COMPlus_EnableDiagnostics=0
Running checks on process 1
Process name: dotnet
---- STARTING TRACER SETUP CHECKS -----
Target process is running with .NET Core
1. Checking Modules Needed so the Tracer Loads:
[WARNING]: The native loader library is not loaded into the process
[WARNING]: The native tracer library is not loaded into the process
[SUCCESS]: The tracer version 2.44.0.0 is loaded into the process.
2. Checking DD_DOTNET_TRACER_HOME and related configuration value:
[SUCCESS]: DD_DOTNET_TRACER_HOME is set to '/app/datadog' and the directory was
found correctly.
3. Checking CORECLR_PROFILER_PATH and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER_PATH is set to the correct
value of /app/datadog/linux-x64/Datadog.Trace.ClrProfiler.Native.so.
4. Checking CORECLR_PROFILER and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER is set to the correct
value of {846F5F1C-F9AE-4B07-969E-05C26BC060D8}.
5. Checking CORECLR_ENABLE_PROFILING and related configuration value:
[SUCCESS]: The environment variable CORECLR_ENABLE_PROFILING is set to the
correct value of 1.
6. Checking if process tracing configuration matches Installer or Bundler:
Installer related documentation:
https://docs.datadoghq.com/tracing/trace_collection/dd_libraries/dotnet-core?tab
=linux#install-the-tracer
[FAILURE]: Error trying to check the Linux installer directory: Could not find
a part of the path '/opt/datadog'.
Note the lines:
[WARNING]: The native loader library is not loaded into the process
✅ .NET 8 COMPlus_EnableDiagnostics unset
Running checks on process 1
Process name: dotnet
---- STARTING TRACER SETUP CHECKS -----
Target process is running with .NET Core
1. Checking Modules Needed so the Tracer Loads:
[SUCCESS]: The tracer version 2.44.0.0 is loaded into the process.
2. Checking DD_DOTNET_TRACER_HOME and related configuration value:
[SUCCESS]: DD_DOTNET_TRACER_HOME is set to '/app/datadog' and the directory was
found correctly.
3. Checking CORECLR_PROFILER_PATH and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER_PATH is set to the correct
value of /app/datadog/linux-x64/Datadog.Trace.ClrProfiler.Native.so.
4. Checking CORECLR_PROFILER and related configuration value:
[SUCCESS]: The environment variable CORECLR_PROFILER is set to the correct
value of {846F5F1C-F9AE-4B07-969E-05C26BC060D8}.
5. Checking CORECLR_ENABLE_PROFILING and related configuration value:
[SUCCESS]: The environment variable CORECLR_ENABLE_PROFILING is set to the
correct value of 1.
---- CONFIGURATION CHECKS -----
1. Checking if tracing is disabled using DD_TRACE_ENABLED.
[INFO]: DD_TRACE_ENABLED is not set, the default value is true.
2. Checking if profiling is enabled using DD_PROFILING_ENABLED.
[INFO]: DD_PROFILING_ENABLED is not set, the continuous profiler is disabled.
---- DATADOG AGENT CHECKS -----
Detected agent url: http://127.0.0.1:8126/. Note: this url may be incorrect if
you configured the application through a configuration file.
Connecting to Agent at endpoint http://127.0.0.1:8126/ using HTTP
[FAILURE]: Error connecting to Agent at http://127.0.0.1:8126/: Connection
refused (127.0.0.1:8126)
Hi @marcovr, thanks for flagging this. It appears that this was a behavior change in .NET 8, which disables the profiling APIs we rely on when you set COMPlus_EnableDiagnostics=0.
Unfortunately, as it's in the runtime, there's nothing we can do about it. However, they suggest the following workaround:
To emulate previous behavior, I suggest setting the following to ensure the behavior is as intended:
DOTNET_EnableDiagnostics=1
DOTNET_EnableDiagnostics_IPC=0
DOTNET_EnableDiagnostics_Debugger=0
DOTNET_EnableDiagnostics_Profiler=1
Could you give that a try and make sure that fixes your issue? Thanks!
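As a sketch, the suggested settings could be applied in a shell session or container entrypoint like this. The four variable names and values come from the suggestion above. Unsetting the legacy `COMPlus_` switch is an extra precaution assumed here to avoid conflicting values, and where these lines actually belong (Dockerfile, deployment manifest, entrypoint script) depends on your setup.

```shell
#!/bin/sh
# Keep the diagnostics master switch on so the profiling API stays available.
export DOTNET_EnableDiagnostics=1
# Switch off the individual features that are not needed.
export DOTNET_EnableDiagnostics_IPC=0
export DOTNET_EnableDiagnostics_Debugger=0
# Leave the profiler enabled; the tracer relies on it.
export DOTNET_EnableDiagnostics_Profiler=1

# Assumption: drop the legacy switch so it cannot conflict with the values above.
unset COMPlus_EnableDiagnostics
```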
Ohh I see. Not your fault then 😉 I can confirm that your suggestion works as expected.
But maybe it would be worth adding a note to the README / setup documentation about this change? I spent quite a while trying to figure out what was causing the issue.
Yep, makes sense - will look at getting that added somewhere - thanks! 🙂
Having DOTNET_EnableDiagnostics=1, though, prevents the use of a read-only root filesystem for dotnet containers. Is there any workaround for having dotnet, a read-only root filesystem, AND Datadog tracing all at the same time?
@nwesoccer that was the reason why we had originally set it to 0 as well 😄
But if you set all of the following environment variables, it does indeed work with a read-only filesystem, because no IPC files are written:
DOTNET_EnableDiagnostics=1
DOTNET_EnableDiagnostics_IPC=0
DOTNET_EnableDiagnostics_Debugger=0
DOTNET_EnableDiagnostics_Profiler=1
See the corresponding docs
@marcovr I'm sorry, my test scenario was dotnet 7, as we have projects on both 7 and 8. I suppose that means that for dotnet 7 we'll need DOTNET_EnableDiagnostics=0 (since the above list doesn't work with dotnet 7 and a read-only filesystem), while for dotnet 8 the above list does work with a read-only filesystem?
Yes, exactly. We solved this by building customized base images that set a different set of variables depending on the .NET version.
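A minimal sketch of that version-dependent selection, assuming the .NET major version is known at image build time. The helper name `diagnostics_env_for` is hypothetical, and treating every version below 8 like .NET 7 is an assumption; this thread only covers 7 and 8.

```shell
#!/bin/sh
# Hypothetical helper: print the diagnostics variables for a .NET major version,
# one NAME=value pair per line.
diagnostics_env_for() {
    major="$1"
    case "$major" in
        ''|*[!0-9]*)
            echo "usage: diagnostics_env_for <major-version>" >&2
            return 2
            ;;
    esac

    if [ "$major" -ge 8 ]; then
        # .NET 8: keep the profiler on, disable IPC and the debugger.
        printf '%s\n' \
            "DOTNET_EnableDiagnostics=1" \
            "DOTNET_EnableDiagnostics_IPC=0" \
            "DOTNET_EnableDiagnostics_Debugger=0" \
            "DOTNET_EnableDiagnostics_Profiler=1"
    else
        # .NET 7 (assumed for older versions too): master switch off.
        printf '%s\n' "DOTNET_EnableDiagnostics=0"
    fi
}
```

The output of `diagnostics_env_for 8` could then be written into the base image's environment, for example as `ENV` instructions generated per version.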
@marcovr Makes sense, Thanks!!
Just FYI, we've added detection of this scenario to the dd-dotnet diagnostic tool - https://github.com/DataDog/dd-trace-dotnet/pull/5208
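For illustration only, a similar check can be sketched in shell. This is not the dd-dotnet implementation from the linked pull request; the function name and message are hypothetical, and the rule it applies (on .NET 8, any of the three switches set to 0 blocks the profiler) is a conservative simplification of what this thread describes.

```shell
#!/bin/sh
# Hypothetical check: returns 0 if the diagnostics switches should allow the
# profiler (and so the tracer) to load, 1 otherwise.
check_profiler_diagnostics() {
    major="$1"

    # Per this thread, the tracer still loads on .NET 7 with diagnostics disabled.
    if [ "$major" -lt 8 ]; then
        return 0
    fi

    for name in COMPlus_EnableDiagnostics DOTNET_EnableDiagnostics \
                DOTNET_EnableDiagnostics_Profiler; do
        # Read the variable called $name; empty if it is unset.
        eval "value=\${$name-}"
        if [ "$value" = "0" ]; then
            echo "warning: $name=0 is set; on .NET 8 the tracer is not expected to load" >&2
            return 1
        fi
    done
    return 0
}
```

Calling `check_profiler_diagnostics 8` before starting the application gives an early hint that traces will be missing.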
Out of interest though, @marcovr/@nwesoccer, why are you setting COMPlus_EnableDiagnostics=0/DOTNET_EnableDiagnostics=0? 🤔
Cool, thanks 🙂
The reason we had set COMPlus_EnableDiagnostics=0 was to run our application in containers with a read-only file system.
When running any .NET application on a read-only file system without this variable set, the runtime fails to start and produces the following output:
Failed to create CoreCLR, HRESULT: 0x8007000E
I suppose this happens because the runtime fails to create the debug pipes.
Interestingly though, this appears to have been fixed with .NET 8.