NServiceBus.Extensions.Diagnostics icon indicating copy to clipboard operation
NServiceBus.Extensions.Diagnostics copied to clipboard

Exception Handling

Open jamesfera opened this issue 4 years ago • 7 comments

I am unable to find any mention of exception handling using the Activity API. I am curious about your thoughts on this. As it stands, Incoming/Outgoing Message failures are not captured and are therefore unavailable to downstream telemetry.

https://github.com/jbogard/NServiceBus.Extensions.Diagnostics/blob/b7ef20b5f579809c5ed69ac7518c9d1ddae89884/src/NServiceBus.Extensions.Diagnostics/OutgoingPhysicalMessageDiagnostics.cs#L21-L37

Something like this would allow the Activity to propagate the Exception (admittedly in an awkward way):

using (var activity = StartActivity(context))
{
  if (activity != null)
  {
      InjectHeaders(activity, context);
  }
  try
  {
    await next().ConfigureAwait(false);
  }
  catch(Exception ex)
  {
    activity.SetCustomProperty("Exception", ex);
  }
  finally
  {
    if (_diagnosticListener.IsEnabled(EventName))
    {
        _diagnosticListener.Write(EventName, context);
    }
  }
}

Have you thought about this? Is there a better way?

jamesfera avatar Feb 09 '21 20:02 jamesfera

I guess you'd want to re-throw the exceptions recoverability can kick in.

DorianGreen avatar Apr 06 '21 18:04 DorianGreen

Yeah I think the problem is there's no easy way to detect when a message is sent to the error queue. It's just not exposed today. I can do the exception but I think the better route would be to hook directly into the NSB error processing, which is also not exposed today.

jbogard avatar Feb 10 '22 02:02 jbogard

And it is not feasible to do something with this kind of hook in the Feature.Setup method?

var recoverability = context.Settings.Get<RecoverabilitySettings>();
recoverability.Failed(settings => settings.OnMessageSentToErrorQueue(message =>
{
    System.Diagnostics.Activity.Current?.SetCustomProperty("exception", message.Exception);
    return Task.CompletedTask;
}));

jgo-bvb avatar May 12 '22 12:05 jgo-bvb

Also setting the Activity Status is important for Application Insights Telemetry.

activity.SetStatus(ActivityStatusCode.Ok);
activity.SetStatus(ActivityStatusCode.Error, "error description"); // exception stacktrace as description?

"Yeah I think the problem is there's no easy way to detect when a message is sent to the error queue." I don't think this should be definition of an error. I don't care if the message is being retried or ends op in the error queue: it failed and I want to see that :-).

I want to see the retries in the Telemetry: I think this is valuable information to show (it helped me multiple times to identify issues). The only downside is that Exceptions are overly reported, because of the retries, but I prefer this, then to hide them.

RemyDuijkeren avatar Jun 28 '22 09:06 RemyDuijkeren

I don't think I saw that failure config, this should be easy enough to add.

jbogard avatar Jun 28 '22 13:06 jbogard

From my tests, just setting the status on the activity doesn't propagate that information into the exporter tool.

To make that work, you need to set specific Otel tags:

activity?.SetTag("otel.status_code", "ERROR");
activity?.SetTag("otel.status_description", ex.Message);

lailabougria avatar Jul 01 '22 07:07 lailabougria

Reading this issue here https://github.com/open-telemetry/opentelemetry-dotnet/issues/2569 , its not being doing on purpose. It's saying it's the responsibility of the exporter(s) to generate those tags.

RemyDuijkeren avatar Jul 01 '22 09:07 RemyDuijkeren