opentelemetry-dotnet-contrib icon indicating copy to clipboard operation
opentelemetry-dotnet-contrib copied to clipboard

HttpClient not setting error message when using CancellationToken

Open jdt opened this issue 1 year ago • 9 comments

Component

OpenTelemetry.Instrumentation.Http

Package Version

Package Name Version
OpenTelemetry.Api 1.8.0
OpenTelemetry 1.8.0

Runtime Version

net8.0

Description

HttpClient supports using the CancellationToken to cancel requests in-flight. The current implementation of the HttpHandlerDiagnosticListener will set an error status in OnStopActivity if the request was canceled and the comments specify that it relies on the OnException handler to set the span status and description. However, when using CancellationToken the OnException handler will not be raised (presumably because the TaskCanceledException that is triggered by a cancellation is already handled and does not bubble up to the OnException handler as you would expect). This means that telemetry exported will have an error status set, but there will be no error message or exception available on the span.

Steps to Reproduce

  • Execute a long-running HTTP call via HttpClient
  • Use the CancellationToken to cancel the request before it completes

Expected Result

I would expect to see some information regarding the nature of the error. It might not be needed to include the TaskCanceledException's full stacktrace here as that exception is 'by design', but there should at least be some feedback available on the span to indicate why the status was set to error.

Actual Result

There is no message or exception available

Additional Context

No response

jdt avatar May 17 '24 10:05 jdt

Some further investigation indeed confirms that simply setting an error message in the OnStopActivity's call to SetStatus will work. The Canceled status is specifically for Task Cancellations so this won't interfere with the existing code.

It would mean a slight change to existing behavior since we'll have an error message where there was none before, but I don't think that constitutes a breaking change. Alternatively, if this doesn't work it would be fairly easy to implement an Enrich operation that gets called when a task gets canceled, which would also fix #1793, and which would allow the desired behavior to be implemented by users of the library.

I would like to create a PR to get this fixed, if someone can have a look and decide what the approach is that should be taken. @Kielek or @vishweshbankwar, any input on this would be greatly appreciated.

jdt avatar May 19 '24 09:05 jdt

@jdt - Your proposed fix looks reasonable to me. Spec also suggests setting the status to error. Not sure what the error.type attribute would be in this case. Adding @antonfirsov to see if this case is handled in metrics on .NET8.0

vishweshbankwar avatar May 21 '24 18:05 vishweshbankwar

Not sure what the error.type attribute would be in this case

In SocketsHttpHandler/HttpClientHandler it is the namespace-qualified exception name in this case, ie. System.Threading.Tasks.TaskCanceledException or System.OperationCanceledException.

antonfirsov avatar May 21 '24 19:05 antonfirsov

Thanks for the feedback, I've opened a PR so we can discuss any other implementation details

jdt avatar May 22 '24 08:05 jdt

Not sure what the error.type attribute would be in this case

In SocketsHttpHandler/HttpClientHandler it is the namespace-qualified exception name in this case, ie. System.Threading.Tasks.TaskCanceledException or System.OperationCanceledException.

@antonfirsov - Is this case handled in http.client.request.duration metric in .NET8.0? Want to ensure we will be consistent with that.

vishweshbankwar avatar May 22 '24 16:05 vishweshbankwar

Is this case handled in http.client.request.duration metric in .NET8.0?

Yes, see https://learn.microsoft.com/en-us/dotnet/core/diagnostics/built-in-metrics-system-net#metric-httpclientrequestduration.

antonfirsov avatar May 22 '24 19:05 antonfirsov

What makes this a bit harder is that the .NET8.0 Metric implementation relies on the actual thrown exception type, which can be either TaskCanceledException or OperationCanceledException. While this is absolutely the right thing to do, the problem is that the HttpHandlerDiagnosticListener doesn't have access to the exception and can only rely on the TaskStatus. Documentation there explicitly states that this is an OperationCanceledException, as is also visible in the code. But since OperationCanceledException is the parent exception type for both TaskCanceledException and OperationCanceledException, we can't be sure of the actualy error.type. Both a TaskCanceledException and an OperationCanceledException are reported as TaskStatus.Canceled.

I would personally prefer to use the OperationCanceledException as it is considered to be the 'blanket' error type covering both cases. I think the differences between both types are implementation details and Microsoft's own recommendation for client code is to catch OperationCanceledException, which also includes TaskCanceledException anyway.

If we want to make this 100% consistent between Traces and Metrics this would involve a change in the .NET Framework itself so it can report task cancellations with an actual detail of the error type.

jdt avatar May 23 '24 06:05 jdt

@jdt - OperationCanceledException for metric also sounds good to me. Eventually the tracing and metric should get consistent when https://github.com/dotnet/runtime/issues/93019 is done in .NET9.0.

@antonfirsov - What do you think?

vishweshbankwar avatar May 28 '24 17:05 vishweshbankwar

Note that TaskCanceledException is more common to be thrown by HttpClient in practice. If you think that will not increase confusion, I'm fine with System.OperationCanceledException, agree that it's more universal.

antonfirsov avatar Jun 03 '24 16:06 antonfirsov

Since this change only affects .NET 8 and we rely on the built-in metrics handling, I'm afraid we can only set the right error type in the HttpHandlerDiagnostics listener. For now this cannot be made consistent, but with .NET9.0 that should be possible.

jdt avatar Jul 15 '24 07:07 jdt