Support UpDownCounter

Open cijothomas opened this issue 4 years ago • 20 comments

UpDownCounter and its async version are two instruments that are part of the spec but are not currently available in the .NET API. This issue is to keep track of supporting them. As the Metrics API is part of the .NET runtime itself, the earliest possible window (if at all) is .NET 7, coming at the end of 2022.

Why did .NET not add UpDownCounter?

UpDownCounter is typically used for tracking something like "queue_size", where the user would do Add() and Remove() on the instrument. In the case of .NET, there are existing tools, such as dotnet-counters and Visual Studio, which can attach to a running process and collect metrics from it. For an instrument like UpDownCounter, unless these tools are attached at startup, they have no way of knowing the current "queue_size". They can only know the Adds/Removes made since the tool was started. This means the original purpose of the instrument would not be met when using these tools (which are already part of the .NET ecosystem).

Due to this, .NET did not include UpDownCounter in the 1st release (.NET 6). Based on user feedback, this might be added in a future version.
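The late-attach problem described above can be reproduced in a small sketch. It assumes System.Diagnostics.DiagnosticSource 7.0 or later (where `UpDownCounter<T>` exists), uses `MeterListener` as a stand-in for an attaching tool, and the meter name, instrument name, and `NewSummingListener` helper are hypothetical.

```csharp
using System;
using System.Diagnostics.Metrics;

// Hypothetical meter and instrument names, chosen for illustration only.
using var meter = new Meter("Demo.Queue");
var queueSize = meter.CreateUpDownCounter<long>("queue_size");

long seenFromStartup = 0;
long seenAfterLateAttach = 0;

// Attached before any measurement is recorded: sees every Add() call.
using var earlyListener = NewSummingListener(delta => seenFromStartup += delta);

queueSize.Add(5);   // five items enqueued before the second "tool" attaches

// Attached late: the earlier +5 is never delivered to this listener.
using var lateListener = NewSummingListener(delta => seenAfterLateAttach += delta);

queueSize.Add(2);
queueSize.Add(-3);

// Prints "early: 4, late: -1"; only the early listener matches the real size.
Console.WriteLine($"early: {seenFromStartup}, late: {seenAfterLateAttach}");

// Hypothetical helper: a listener that forwards every delta to onDelta.
static MeterListener NewSummingListener(Action<long> onDelta)
{
    var listener = new MeterListener();
    listener.InstrumentPublished = (instrument, l) =>
    {
        if (instrument.Meter.Name == "Demo.Queue")
            l.EnableMeasurementEvents(instrument);
    };
    listener.SetMeasurementEventCallback<long>(
        (instrument, delta, tags, state) => onDelta(delta));
    listener.Start();
    return listener;
}
```

The late listener ends up with a negative "queue size", which is the situation the paragraph above describes for tools attached after startup.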

Further reading: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/supplementary-guidelines.md#instrument-selection

cijothomas avatar Sep 17 '21 00:09 cijothomas

This seems to ignore the fact that some customers use metrics for things other than "queue_size" or process-related things in general. For example, in my stock trading application I'm currently using ApplicationInsights to track profit metrics for each trade that my bot performs. Sometimes my profit is negative, so I track a negative value using TelemetryClient.GetMetric(metricId).TrackValue(profit, dimension1, dimension2, dimension3);.

I'm looking for a way to migrate away from using ApplicationInsights, and my research led me to conclude that the new Metrics API in .NET is the way forwards. Unfortunately, without support for tracking negative values, I'm going to be stuck with ApplicationInsights for far longer than I had hoped.

@cijothomas Can you recommend other alternatives for me? I'm not happy with ApplicationInsights as it appears to be vendor-specific and I'm not getting the functionality I need from it (and the charting in Azure is terrible and keeps breaking). I'd like to move to Grafana Cloud, and not have my metrics stored in Azure at all, but I don't know which way to go now.

shaynevanasperen avatar Sep 17 '21 04:09 shaynevanasperen

@shaynevanasperen Keep in mind that the metrics API is still not considered stable, but more than likely your equivalent would be something more like the AsynchronousGauge. It's an instrument which allows for tracking of a value.

hdost avatar Sep 29 '21 22:09 hdost

@shaynevanasperen - Agreed with @hdost. The API names OpenTelemetry settled on differed from the original concept names used during design, so the API to check out is ObservableGauge.

The way that one works is you supply a delegate and OpenTelemetry will invoke your delegate at each reporting interval to get the value you want to report. You can compute your value however you wish in the delegate and it can be positive or negative.
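A minimal sketch of that delegate-based pattern, using `System.Diagnostics.Metrics` directly. The meter name, instrument name, and `lastTradeProfit` variable are hypothetical, and the `MeterListener` with a manual `RecordObservableInstruments()` call only stands in for the SDK's reporting interval. Note that a gauge reports whatever the delegate returns at collection time, so this shows the latest value rather than every individual trade.

```csharp
using System;
using System.Diagnostics.Metrics;

// Hypothetical names; "profit" stands in for any value that may go negative.
using var meter = new Meter("Demo.Trading");
double lastTradeProfit = 0;

// The delegate is invoked each time the instrument is observed.
meter.CreateObservableGauge<double>("trade_profit", () => lastTradeProfit);

double reported = double.NaN;
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Trading")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<double>(
    (instrument, value, tags, state) => reported = value);
listener.Start();

lastTradeProfit = -12.5;                  // a losing trade
listener.RecordObservableInstruments();   // stand-in for one reporting interval
Console.WriteLine(reported);              // the negative value is reported as-is
```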

noahfalk avatar Oct 10 '21 01:10 noahfalk

I am also struggling with the lack of an UpDownCounter. In my case, I have a metric that gets adjusted in multiple places up and down with various dimensions. Unless I choose to create my own structure for maintaining this value and then reference that from the observable gauge callback, I can't think of any way to produce what should be a simple gauge with dimensions. Feels very unnatural for this scenario.
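For reference, the workaround described above (keeping your own structure and reading it from the observable gauge callback) might look roughly like the sketch below. All names (`Demo.Inventory`, `items_in_flight`, the `region` dimension, the `Adjust` helper) are hypothetical, and the `MeterListener` only stands in for an SDK reader.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Linq;

using var meter = new Meter("Demo.Inventory");

// Current value per dimension value; can be adjusted from anywhere in the app.
var levels = new ConcurrentDictionary<string, long>();

void Adjust(string region, long delta) =>
    levels.AddOrUpdate(region, delta, (_, current) => current + delta);

// The callback returns one measurement per dimension value, tagged accordingly.
meter.CreateObservableGauge<long>("items_in_flight", () =>
    levels.Select(kv => new Measurement<long>(
        kv.Value, new KeyValuePair<string, object?>("region", kv.Key))));

var observed = new Dictionary<string, long>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Inventory")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    foreach (var tag in tags)
        if (tag.Key == "region")
            observed[(string)tag.Value!] = value;
});
listener.Start();

Adjust("eu", 3);
Adjust("us", 5);
Adjust("eu", -1);
listener.RecordObservableInstruments();   // stand-in for one reporting interval

Console.WriteLine($"eu={observed["eu"]}, us={observed["us"]}");
```

This works, but it does mean maintaining the running totals yourself, which is the extra bookkeeping the comment above objects to.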

pcwiese avatar Nov 03 '21 19:11 pcwiese

@cijothomas I'm trying to do some research in prep for a larger discussion on a separate issue, and was wondering, would a PollingCounter actually be an implementation of UpDownCounter but with a different name?

Here's some documentation that clued me in to this: https://docs.microsoft.com/en-us/dotnet/core/diagnostics/event-counters#net-core-runtime-example-counters

ktmitton avatar Nov 26 '21 22:11 ktmitton

PollingCounter is part of the EventCounter API from .NET. The new Metrics API (the one which is based on OpenTelemetry Metric API spec) is not related to that. See comparison: https://docs.microsoft.com/en-us/dotnet/core/diagnostics/compare-metric-apis

cijothomas avatar Nov 30 '21 06:11 cijothomas

I'm taking a look at UpDownCounter and wondering if it could be used to show the current queue size in a distributed system. For instance, if I have many processes adding items to a queue, all calling Add(1) whenever an item is enqueued, and I also have many processes that are each dequeuing 1 entry at a time and calling Add(-1), would all of these measurements across processes be able to be aggregated together to give me the current queue size?

ejsmith avatar Mar 16 '22 19:03 ejsmith

all of these measurements across processes be able to be aggregated together to give me the current queue size?

Yes, to my knowledge. This should be supported in any metrics backend, like Prometheus.

cijothomas avatar Mar 16 '22 19:03 cijothomas

Ok, interesting. Is there an ability to set a point in time absolute value when it gets out of sync?

ejsmith avatar Mar 16 '22 20:03 ejsmith

Ok, interesting. Is there an ability to set a point in time absolute value when it gets out of sync?

I didn't quite understand the question. (Given we don't yet have this in .NET, you might benefit from asking this in the Otel-Metrics Slack channel, https://cloud-native.slack.com/archives/C01NP3BV26R, where other language maintainers can give more concrete answers, as they already have UpDownCounter.)

cijothomas avatar Mar 16 '22 20:03 cijothomas

Update: DiagnosticSource version 7.0, which ships with .NET 7, should contain API support for UpDownCounter. See https://github.com/dotnet/runtime/issues/63648

cijothomas avatar Mar 16 '22 20:03 cijothomas

@cijothomas yeah, I saw that and it's what prompted me to ask questions here. :-) I tried accessing that slack account and wasn't able to. Is there some other place to ask?

ejsmith avatar Mar 16 '22 20:03 ejsmith

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/supplementary-guidelines.md#instrument-selection You can open an issue in the spec repo.

Regarding Slack - I think you need to join the CNCF Slack organization to access the channels.

cijothomas avatar Mar 17 '22 00:03 cijothomas

Uncertain whether this would satisfy your scenario, but current queue size in this case may be better tracked with an Async UpDownCounter (also not yet available in .NET). You'd need some process that periodically observes the queue size and sends it up.

Though, I suppose you'd lose the ability to add any context to the metrics from individual enqueue/dequeue operations.
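As a rough sketch of that suggestion: in System.Diagnostics.DiagnosticSource 7.0 and later, the asynchronous variant is exposed as `ObservableUpDownCounter<T>`. The in-memory `ConcurrentQueue` below is only a stand-in for whatever queue is being observed, the meter and instrument names are hypothetical, and the manual `RecordObservableInstruments()` call stands in for the periodic collection an SDK would perform.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics.Metrics;

using var meter = new Meter("Demo.Queue.Async");

// Stand-in for the real queue whose size is being observed.
var queue = new ConcurrentQueue<string>();

// Reports the absolute current size at each observation, not a delta.
meter.CreateObservableUpDownCounter<int>("queue_size", () => queue.Count);

int observedSize = -1;
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Queue.Async")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<int>(
    (instrument, value, tags, state) => observedSize = value);
listener.Start();

queue.Enqueue("a");
queue.Enqueue("b");
queue.TryDequeue(out _);

listener.RecordObservableInstruments();   // stand-in for one periodic observation
Console.WriteLine(observedSize);          // prints 1
```

Because the value is read from the queue itself, an observer that starts late still gets the correct size, at the cost of losing per-operation context as noted above.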

alanwest avatar Mar 17 '22 21:03 alanwest

This is a super common scenario. I'd think OTel would have to be able to handle it, but I'm not seeing how by looking at the spec.

ejsmith avatar Mar 17 '22 23:03 ejsmith

Is it possible to have a non-Observable Gauge? Or would an UpDownCounter get the job done?

I have millions of actor mailboxes per process whose depth I would like to report upon when those actors are scheduled (only a few thousand at any given second) - using the observable gauge means millions of delegate allocations and presumably, a large number of background tasks.

Aaronontheweb avatar Mar 24 '22 15:03 Aaronontheweb

@Aaronontheweb I was thinking that same thing. It does seem like a non-observable gauge would be really useful in these scenarios, but surely the Otel people have thought about these common use cases.

ejsmith avatar Mar 24 '22 15:03 ejsmith

@Aaronontheweb I was thinking that same thing. It does seem like a non-observable gauge would be really useful in these scenarios, but surely the Otel people have thought about these common use cases.

There were discussions about a sync version of Gauge in some Spec Meetings, but it didn't make it into the final API. (The API is stable, but can take new instruments.) I'd suggest having this conversation in the specification repo, so as to get the right feedback.

cijothomas avatar Mar 24 '22 16:03 cijothomas

@cijothomas done - replied here https://github.com/open-telemetry/opentelemetry-specification/issues/2318

Aaronontheweb avatar Apr 06 '22 17:04 Aaronontheweb

Update:

https://www.nuget.org/packages/System.Diagnostics.DiagnosticSource/7.0.0-preview.3.22175.4 has added UpDownCounter. We will incorporate that into the OpenTelemetry SDK/Exporters after the 1.2 release is completed. (We won't be able to release a stable version with UpDownCounter until Nov 2022, as that is when DS 7.0 stable is expected.)

cijothomas avatar Apr 15 '22 02:04 cijothomas

Fixed via https://github.com/open-telemetry/opentelemetry-dotnet/pull/3606

cijothomas avatar Oct 07 '22 14:10 cijothomas