newrelic-dotnet-agent
Infinite tracing causes an exception when running in Alpine linux docker container
Bug
Enabling Infinite Tracing when running the agent in an Alpine Linux Docker container causes the following exception:
2020-10-07 12:27:07,093 NewRelic DEBUG: [pid: 1, tid: 19] SpanStreamingService: Error creating gRPC channel to endpoint ... . (attempt 0) - Exception: NewRelic.Agent.Core.DataTransport.GrpcWrapperException: Unable to create new gRPC Channel
---> System.IO.IOException: Error loading native library "/app/newrelic/libgrpc_csharp_ext.x64.so". Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /app/newrelic/libgrpc_csharp_ext.x64.so)
at Grpc.Core.Internal.UnmanagedLibrary..ctor(String[] libraryPathAlternatives)
at Grpc.Core.Internal.NativeExtension.LoadUnmanagedLibrary()
at Grpc.Core.Internal.NativeExtension.LoadNativeMethods()
at Grpc.Core.Internal.NativeExtension..ctor()
at Grpc.Core.Internal.NativeExtension.Get()
at Grpc.Core.GrpcEnvironment.GrpcNativeInit()
at Grpc.Core.GrpcEnvironment..ctor()
at Grpc.Core.GrpcEnvironment.AddRef()
at Grpc.Core.Channel..ctor(String target, ChannelCredentials credentials, IEnumerable`1 options)
at NewRelic.Agent.Core.DataTransport.GrpcWrapper`2.CreateChannel(String host, Int32 port, Boolean ssl, Metadata headers, Int32 connectTimeoutMs, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at NewRelic.Agent.Core.DataTransport.GrpcWrapper`2.CreateChannel(String host, Int32 port, Boolean ssl, Metadata headers, Int32 connectTimeoutMs, CancellationToken cancellationToken)
at NewRelic.Agent.Core.DataTransport.DataStreamingService`3.CreateChannel(CancellationToken cancellationToken)
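The inner exception names `ld-linux-x86-64.so.2`, which is the glibc dynamic loader; Alpine is based on musl and does not ship it by default. A minimal sketch for confirming this inside a container follows. The helper name `has_glibc_loader` and the two directories it searches are assumptions made for illustration, not part of the agent or of gRPC.

```shell
#!/bin/sh
# Hypothetical helper (not part of the agent): report whether the glibc
# dynamic loader named in the error exists under a root directory.
# The root defaults to "/"; lib64 and lib are assumed search locations.
has_glibc_loader() {
    root="${1:-/}"
    for dir in lib64 lib; do
        # -e follows symlinks, so a compatibility symlink also counts.
        if [ -e "${root%/}/${dir}/ld-linux-x86-64.so.2" ]; then
            return 0
        fi
    done
    return 1
}

if has_glibc_loader; then
    echo "glibc loader found"
else
    echo "glibc loader missing"
fi
```

Running this in the application container should show whether the loader that `libgrpc_csharp_ext.x64.so` asks for is present at all.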
Workaround
This is related to https://github.com/grpc/grpc/issues/21446, and a workaround is provided.
⚠️ Note, however, there is an open question regarding a security vulnerability in relation to the suggested workaround.
Testing to determine whether this issue is the same as #394, which occurs with .NET 5. Tested with .NET Core 3.1 and Alpine 3.12.
Used the following images:
FROM mcr.microsoft.com/dotnet/aspnet:3.1-alpine3.12 AS base
FROM mcr.microsoft.com/dotnet/sdk:3.1 AS build
Used the following gRPC package versions in the Agent:
<PackageReference Include="Google.Protobuf" Version="3.11.4" />
<PackageReference Include="Grpc" Version="2.35.0" />
<PackageReference Include="Grpc.Core" Version="2.35.0" />
<PackageReference Include="Grpc.Tools" Version="2.28.1">
Used the build_functions.ps1 script from this PR.
Used a basic asp.net test app (Vu's InfiniteTraceDemo).
Observed the following errors in the Agent log:
2021-01-26 17:30:31,010 NewRelic DEBUG: [pid: 1, tid: 12] SpanStreamingService: Error creating gRPC channel to endpoint 092fb164-247e-4993-acb7-84b4f6c7f135.aws-us-east-2.tracing.staging-edge.nr-data.net:443. (attempt 0) - Exception: NewRelic.Agent.Core.DataTransport.GrpcWrapperException: Unable to create new gRPC Channel
---> System.DllNotFoundException: Unable to load shared library 'grpc_csharp_ext.x64' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: Error loading shared library libgrpc_csharp_ext.x64: No such file or directory
at Grpc.Core.Internal.NativeMethods.DllImportsFromSharedLib_x64.grpcsharp_redirect_log(GprLogDelegate callback)
at Grpc.Core.Internal.NativeLogRedirector.Redirect(NativeMethods native)
at Grpc.Core.Internal.NativeExtension..ctor()
at Grpc.Core.Internal.NativeExtension.Get()
at Grpc.Core.Internal.NativeMethods.Get()
at Grpc.Core.GrpcEnvironment.GrpcNativeInit()
at Grpc.Core.GrpcEnvironment..ctor()
at Grpc.Core.GrpcEnvironment.AddRef()
at Grpc.Core.Channel..ctor(String target, ChannelCredentials credentials, IEnumerable`1 options)
at Grpc.Core.Channel..ctor(String host, Int32 port, ChannelCredentials credentials, IEnumerable`1 options)
at Grpc.Core.Channel..ctor(String host, Int32 port, ChannelCredentials credentials)
at NewRelic.Agent.Core.DataTransport.GrpcWrapper`2.CreateChannel(String host, Int32 port, Boolean ssl, Metadata headers, Int32 connectTimeoutMs, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at NewRelic.Agent.Core.DataTransport.GrpcWrapper`2.CreateChannel(String host, Int32 port, Boolean ssl, Metadata headers, Int32 connectTimeoutMs, CancellationToken cancellationToken)
at NewRelic.Agent.Core.DataTransport.DataStreamingService`3.CreateChannel(CancellationToken cancellationToken)
While different from the initially reported error, the gist is that required files/dlls cannot be found.
As the log suggests, consider setting the LD_DEBUG environment variable, but first verify that LD_DEBUG is supported on Alpine.
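Independently of LD_DEBUG, a simpler first check is whether the native extension file is present in the agent directory at all. The sketch below lists matching files; the helper name `list_grpc_native_libs` is hypothetical, and the default directory `/app/newrelic` is taken from the path in the originally reported error, so it may differ in other deployments.

```shell
#!/bin/sh
# Hypothetical helper (not part of the agent): list gRPC native extension
# files directly inside a directory. The default path is an assumption
# based on the path shown in the original error message.
list_grpc_native_libs() {
    dir="${1:-/app/newrelic}"
    # Prints nothing when no file matches or the directory does not exist.
    find "$dir" -maxdepth 1 -name 'libgrpc_csharp_ext*' 2>/dev/null | sort
}

list_grpc_native_libs
```

Empty output would suggest the library was never copied into the image, whereas a listed file points back toward a missing dependency of that library.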
Pausing this investigation at this point due to change in priority.
This issue is still present, which means Infinite Tracing is not usable on Alpine Linux without adding packages to the base image that expose it to known CVEs. You may want to update the documentation to state clearly that this environment is not supported, or at least to list Infinite Tracing in the Unavailable features section.
@amweiss Sorry for the delay in following up on this. We had hoped to find a fix for this issue in a recent bug fix milestone, but we did not. I have made the documentation update that you requested: https://github.com/newrelic/docs-website/pull/1357
@nr-ahemsath Is there a new milestone that includes a fix for this issue?
Relates to #520
@pawel-przybyla This issue is currently in an upcoming "Bug Smash" milestone for our team, but it has not been scheduled in any of our plans, which currently extend as far out as September.
12/10/2021: We need to switch the gRPC library to the managed version before we can address this issue.
Re: vulnerability, why not use the latest version? The vulnerability exists in versions up to and including 1.1.23, while the latest version is 1.2.3: https://git.musl-libc.org/cgit/musl
https://issues.newrelic.com/browse/NEWRELIC-3505
Jira CommentId: 116271 Commented by angelatan:
After reviewing this ticket, we decided that the Alpine Linux + Infinite Tracing combination is a low-usage use case. Over the last year, it has not risen to a priority level. We have decided that this is a "won't fix". Please open another ticket or reopen the GitHub issue if this continues or is causing issues on your front.
This issue won't be actioned.