ApplicationInsights-dotnet
ApplicationInsights-dotnet copied to clipboard
Polly based retries result into telemetry for retried requests not getting captured
Microsoft publishes Polly based package which allows developers to configure retries for requests based on certain condition being true.
However, using this package results into retries not being captured in AI because AI thinks of them as duplicate requests and just captures the very first request.
I think this is happening due this following logic in AI package
I am adding a small repro of the issue. In startup.cs if you uncomment the line
AddHttpMessageHandler<TraceParentStrippingHandler>()
You will see 3 entries being recorded, otherwise only one (the very first request).
The zip file is empty can you upload it again?
I think this is happening due this following logic in AI package
Yes, this is by design.
For now the only workaround is to remove traceparent
header for retries, so that ApplicationInsights no longer ignores it.
Long term, this will be fixed via OpenTelemetry route, as OpenTelemetry folks are working on a specification on how to report telemetry from Http Retry/Redirects.
@cijothomas If we remove the traceparent header for the retries, will the operation id of the logged entry be different from the operation id of the initial request? Because then we wouldn't be able to group calls by the same operation Id right? We'd have one call with operation id "abc" and then the retries would all have different operation id's?
@cijothomas If we remove the traceparent header for the retries, will the operation id of the logged entry be different from the operation id of the initial request? Because then we wouldn't be able to group calls by the same operation Id right? We'd have one call with operation id "abc" and then the retries would all have different operation id's?
I don't think so. The parent's Activity.Id will still used to populate operationid,parentid for the DependencyTelemetry.
In case anyone interested, heres the PR fixing this issue in OpenTelemetry https://github.com/open-telemetry/opentelemetry-dotnet/pull/3072
@cijothomas If I am understanding the fix correctly, it is the same fix I had proposed offline to you right? Remove the headers so the system thinks of it being a unique request.
Its not really a "fix". There is a spec in OpenTelemetry on how to report spans for retries. and OpenTelemetry is following that spec. Impln. wise - yes it means if there is a traceparent, its "overrridden" with the new traceparent, and every retry attempt creates a new span.
Long term, this will be fixed via OpenTelemetry route
It looks like the change underlying the referenced OpenTelemetry PR (#3072) has been merged. Does that mean we should expected this issue to be resolved "the right way" in the near future (such that we can roll back remove traceparent
header the workaround)?
(@cijothomas I didn't fully understand your most recent comment, apologies if my question is redundant.)
@mirekkukla No fix for this is planned in this repo. The PR linked is showing how OpenTelemetry is solving this issue, and it'll help Application Insights users only if you are on the OTel Based SDK which is in preview mode now : https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=net
If you are using Application Insights' current SDK (the ones shipped from this repo), no fix has been done for this issue.
This issue is stale because it has been open 300 days with no activity. Remove stale label or this will be closed in 7 days. Commenting will instruct the bot to automatically remove the label.