ApplicationInsights-dotnet
Support percentiles for aggregated metrics
<<I suspect this qualifies as an enhancement request>> Per this StackOverflow answer, it’s not possible to do percentiles on aggregated metrics sent through AppInsights. https://stackoverflow.com/questions/58124268/how-to-do-percentiles-on-custom-metrics-in-azure-appinsights
The request is to support this in some form, since it seems like a significant miss relative to other platforms like Prometheus. Is there any workaround other than sending telemetry for every metric measurement (since that won’t scale at all)?
I would love not to have to set up Prometheus/Grafana infrastructure to support this. Thanks!
@vgorbenko Is this something on the Metrics roadmap?
Any indication of how close this might be?
For high-volume scenarios it makes AppInsights unusable for metrics (since simple averages won't cut it for production monitoring). If there's a solution AppInsights provides here that I'm missing please let me know (our plan is to track aggregate metrics for billions of events/day).
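To make the point about averages concrete, here is a minimal sketch (not AppInsights code) with two made-up latency samples. They share the same count, sum, min and max, so any pre-aggregated summary of that shape is identical for both, yet their percentiles differ. The sample values and the helper names `nearest_rank_percentile` and `summarize` are illustrative assumptions.

```python
import math


def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]


def summarize(values):
    """The kind of summary a pre-aggregated metric typically carries."""
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical latency samples in milliseconds.
steady = [10, 50, 50, 50, 50, 50, 50, 50, 50, 90]
spiky = [10, 10, 10, 10, 10, 90, 90, 90, 90, 90]

same_summary = summarize(steady) == summarize(spiky)
steady_p90 = nearest_rank_percentile(steady, 90)
spiky_p90 = nearest_rank_percentile(spiky, 90)

print("identical aggregates:", same_summary)
print("p90 steady:", steady_p90, "p90 spiky:", spiky_p90)
```

Since both samples collapse to the same aggregate, no query over the aggregate alone can tell their p90 values apart.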
@andyvig This is not planned for 2019. I will check and report back on the plan for next semester (2020). I also know that it's possible for you to write a custom aggregator and plug it into the rest of the metrics pipeline if you want to do percentiles. It's not documented, but if you want to take a look, here's where to start looking: https://github.com/microsoft/ApplicationInsights-dotnet/blob/develop/src/Microsoft.ApplicationInsights/Metrics/Extensibility/MetricSeriesAggregatorBase.cs
Thanks @cijothomas, how would we then query that on the Log Analytics side? Does the percentile function support aggregate data?
I'm looking for something similar to this operation in Prometheus:
"To calculate the 90th percentile of request durations over the last 10m"
histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))
From https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile
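For readers unfamiliar with what that query does, here is a rough sketch of the estimation idea behind `histogram_quantile`: find the bucket that contains the target rank, then interpolate linearly inside it. It is a simplified illustration, not the Prometheus implementation. It takes cumulative bucket counts directly (the PromQL above first applies `rate()`), and the bucket bounds and counts are invented for the example.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, where the last upper bound is math.inf.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("the last bucket must have an upper bound of +Inf")

    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    prev_bound, prev_count = 0.0, 0  # assume the lowest bucket starts at 0
    for bound, count in buckets:
        if count >= rank:
            if bound == math.inf:
                # The rank falls in the overflow bucket, so the best we can
                # report is the highest finite upper bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical request-duration buckets in seconds: (le, cumulative count).
example_buckets = [(0.1, 50), (0.5, 80), (1.0, 95), (math.inf, 100)]
p90 = histogram_quantile(0.9, example_buckets)
print(round(p90, 4))
```

The result is only as precise as the bucket layout, which is why bucket boundaries are usually chosen around the latency thresholds that matter.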
I don't think there is any native support, as the schema doesn't have anything for storing percentiles: https://github.com/microsoft/ApplicationInsights-dotnet/blob/develop/src/Microsoft.ApplicationInsights/Extensibility/Implementation/External/DataPoint_types.cs
You'd need to store quantiles as customProps and write custom queries to get them, as Analytics won't understand customProps.
@SergeyKanzhelev even if one authors their own aggregator, is there any way to store quantiles (.1, .5, .9, etc.) in the schema?
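As a rough illustration of the customProps workaround, the sketch below computes quantiles locally for one aggregation window and packs them into a string-to-string dictionary that could be attached to a single aggregated telemetry item. This is not SDK code: the function name `quantile_properties`, the property keys (`p50`, `p90`, `p99`) and the sample data are assumptions made for the example.

```python
import statistics


def quantile_properties(measurements, percentiles=(50, 90, 99)):
    """Build a string-to-string property bag holding locally computed percentiles.

    The keys ("p50", "p90", ...) are a hypothetical naming convention, not
    something the AppInsights schema defines.
    """
    if len(measurements) < 2:
        raise ValueError("need at least two measurements to compute quantiles")
    # quantiles(..., n=100) returns the 99 cut points p1..p99.
    cut_points = statistics.quantiles(measurements, n=100, method="inclusive")
    props = {"count": str(len(measurements))}
    for p in percentiles:
        props[f"p{p}"] = repr(cut_points[p - 1])
    return props


# Hypothetical latencies (ms) collected during one aggregation window.
window = list(range(1, 101))
props = quantile_properties(window)
print(props)
```

Because the values are stored as strings, a query would have to read them back from the custom properties and cast them to numbers. Percentiles stored this way also cannot be re-aggregated across instances or time windows.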
Any news / roadmap item / documentation / customer guidance on
- publishing metrics as histograms to AppInsights
- with the goal of using percentiles in Queries/Views/Alerts
to make AppInsights a good fit for SLOs?
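One reason histograms come up for this use case is that bucket counts can be summed across instances and time windows, whereas per-instance percentiles cannot be combined afterwards. The sketch below shows that property with plain Python. The bucket bounds, sample values and function names are invented for the example and are not part of any AppInsights or OpenTelemetry API.

```python
import bisect


def to_histogram(values, bounds):
    """Count values into buckets with inclusive upper bounds.

    Returns len(bounds) + 1 counts; the last slot is the overflow bucket.
    """
    counts = [0] * (len(bounds) + 1)
    for value in values:
        # Index of the first bound that is >= value, or len(bounds) for overflow.
        counts[bisect.bisect_left(bounds, value)] += 1
    return counts


def merge_histograms(a, b):
    """Merge two histograms that share the same bucket bounds."""
    if len(a) != len(b):
        raise ValueError("histograms must use the same bucket layout")
    return [x + y for x, y in zip(a, b)]


# Hypothetical bucket bounds (ms) and samples from two service instances.
bounds = [10, 50, 100]
instance_a = [5, 20, 200]
instance_b = [10, 60, 70]

merged = merge_histograms(to_histogram(instance_a, bounds),
                          to_histogram(instance_b, bounds))
combined = to_histogram(instance_a + instance_b, bounds)
print(merged, combined)
```

Merging the per-instance histograms gives the same counts as building one histogram from all raw measurements, so a percentile estimated from the merged buckets reflects the whole fleet.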
No work is planned to add support for this in ApplicationInsights SDK.
Metrics support in OpenTelemetry is coming by the end of 2021 (Nov 2021): https://github.com/open-telemetry/opentelemetry-dotnet/issues/1501. After the OpenTelemetry part ships, there will be a supported way to export metrics to ApplicationInsights, but there are no solid dates for this. There is also no solid date for supporting percentiles/histograms in ApplicationInsights.
This issue is stale because it has been open 300 days with no activity. Remove stale label or comment or this will be closed in 7 days.
@cijothomas Checking in here...
Is this still a "maybe sometime in the future but all dates unknown" situation, or is there any more definition around if/when this might be supported? Thanks.
No firm dates that I can share (the feature requires not just SDK support, but also backend/UI work, etc.). From the SDK side, this will likely come via the OpenTelemetry route, and not from this repo.
This issue is stale because it has been open 300 days with no activity. Remove stale label or this will be closed in 7 days. Commenting will instruct the bot to automatically remove the label.
We still don't have a clear statement on if and how this will come.
If this feature is not coming to the Azure Monitor / AppInsights backend and the various SDKs, there should be some published guidance on how these technologies could be used by someone who wants to follow SRE best practices:
Hi Ricardo, Azure Managed Prometheus (Preview) was announced last month and is available with Azure Managed Grafana integration. This is compatible with Prom Client.
https://learn.microsoft.com/azure/azure-monitor/essentials/prometheus-metrics-overview
Additionally we are working on supporting percentiles via the OpenTelemetry histogram API. Unfortunately this work requires some major changes in how our backend works and thus any release is likely 6+ months out.
CC: @vishiy
Thanks a lot for clarification and details around workarounds/other possibilities.
@mattmccleary can you provide an update on this? Thx
@mattmccleary any update on percentile tracking support?
I'm trying to use `Azure.Monitor.OpenTelemetry.Exporter` to collect and report on our application latency. It looks like it currently tracks things like the max value, but that's not very useful for practical purposes, since the max value can be influenced by a lot of external factors and doesn't necessarily provide an accurate view of how the app is doing. Ideally we want to track the 99th percentile of this latency value, but I can't figure out how to do that or whether it's supported at all.