[k8sclusterreceiver] refactoring pod status phase

Open povilasv opened this issue 2 years ago • 17 comments

Component(s)

receiver/k8scluster

Describe the issue you're reporting

We currently encode the pod status phase as a number:

  k8s.pod.phase:
    enabled: true
    description: Current phase of the pod (1 - Pending, 2 - Running, 3 - Succeeded, 4 - Failed, 5 - Unknown)
    unit: 1
    gauge:
      value_type: int

I think it's preferable to split it into multiple metrics:

  k8s.pod.phase_pending:
    enabled: true
    description: Whether this pod phase is Pending (1), or not (0).
    unit: 1
    gauge:
      value_type: int
  k8s.pod.phase_running:
    enabled: true
    description: Whether this pod phase is Running (1), or not (0).
    unit: 1
    gauge:
      value_type: int

IMO it's easier to tell from the name what is happening, and we already have this pattern for node conditions.
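
For reference, the node conditions pattern I mean looks roughly like this in the receiver's metadata.yaml (sketch from memory; the exact description may differ):

  k8s.node.condition_ready:
    enabled: true
    description: Ready condition status of the node (true=1, false=0, unknown=-1)
    unit: 1
    gauge:
      value_type: int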

Thoughts?

povilasv avatar Jul 21 '23 11:07 povilasv

Pinging code owners:

  • k8sclusterreceiver: @dmitryax
  • receiver/k8scluster: @dmitryax

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jul 21 '23 11:07 github-actions[bot]

I see your point, but I have a couple of counter-arguments:

  1. This will introduce a 5x bigger payload to deliver the same information.
  2. When you look at the chart, it's pretty convenient to see what states the pod has been in over time using just one time series. Some UIs can let you assign the states to the values on the chart, which is very convenient. Using several metrics for that is not that handy.

If a user wants this particular representation, I believe it can be translated with the metrics transform processor. Also, most backends likely have a query language capable of representing this metric as 5 "boolean" time series.

I wouldn't like to make this change until we have strict guidelines from the OTel Spec on how to represent this kind of data.

dmitryax avatar Jul 21 '23 20:07 dmitryax

Thanks for the reply. I agree with you: k8s.pod.phase is way more compact, and in theory the transform processor should work. I will play with it to see if it does in practice.
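
Concretely, what I plan to try is the transform processor with OTTL (not metricstransform). Here is an untested sketch that copies the numeric phase into a string attribute and then flattens the value to 1 (note it still can't emit 0-valued series for the phases the pod is not in):

  processors:
    transform:
      metric_statements:
        - context: datapoint
          statements:
            # map the numeric phase onto a string attribute first...
            - set(attributes["phase"], "Pending") where metric.name == "k8s.pod.phase" and value_int == 1
            - set(attributes["phase"], "Running") where metric.name == "k8s.pod.phase" and value_int == 2
            - set(attributes["phase"], "Succeeded") where metric.name == "k8s.pod.phase" and value_int == 3
            - set(attributes["phase"], "Failed") where metric.name == "k8s.pod.phase" and value_int == 4
            - set(attributes["phase"], "Unknown") where metric.name == "k8s.pod.phase" and value_int == 5
            # ...then flatten every remaining value to 1
            - set(value_int, 1) where metric.name == "k8s.pod.phase"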

povilasv avatar Jul 24 '23 12:07 povilasv

When you look at the chart, it's pretty convenient to see what states the pod has been in over time using just one time series. Some UIs can let you assign the states to the values on the chart, which is very convenient.

Can you provide a screenshot/example?

Also, most backends likely have a query language capable of representing this metric as 5 "boolean" time series.

Example? I can't get this to work in either of the monitoring vendors my company uses.

Since the actual numeric values are meaningless, I can't see how to make this work when aggregating across dimensions. Many monitoring tools also don't let you filter on metric values, just tags (i.e. you can't say WHERE value = 2). For example, if I'd like to see the number of pods that are in the Running phase in a DaemonSet, how do I do that?

I think an OpenMetrics StateSet would be the more idiomatic way to model this information (and is also supported by OTel).

This is also how kube-state-metrics does it with kube_pod_status_phase. Seems to work quite well in PromQL: count(kube_pod_status_phase{phase="Running"}).
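
A couple more untested examples of the kind of queries this enables (assuming the standard kube-state-metrics labels; restricting to a single DaemonSet would additionally need a join on kube_pod_owner):

  # number of Running pods per namespace
  sum by (namespace) (kube_pod_status_phase{phase="Running"})

  # pods currently stuck in Pending, cluster-wide
  sum(kube_pod_status_phase{phase="Pending"})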

sirianni avatar Aug 04 '23 19:08 sirianni

I believe it can be translated with the metrics transform processor

@dmitryax can you give a rough example of how you would use the metricstransform processor to do this transformation? From what I can see, there's no way to "match" based on values, and aggregating the "numeric enums" doesn't make sense either.

sirianni avatar Aug 07 '23 19:08 sirianni

I agree with @sirianni.

OOT, the k8s.pod.phase metric should keep the same name (don't split it into k8s.pod.phase_pending, k8s.pod.phase_running, etc.), but it should have a state attribute, to be consistent with similar enumerations in other metrics (although I don't think state is specified anywhere in OTel's semantic conventions).

So you'll push several datapoints to represent the current "running" state of the pod:

  • k8s.pod.phase{state="running"} 1
  • k8s.pod.phase{state="pending"} 0
  • k8s.pod.phase{state="succeeded"} 0
  • k8s.pod.phase{state="failed"} 0
  • k8s.pod.phase{state="unknown"} 0

Not ultra optimal in terms of cardinality, but it does the job, and it's easily queryable.

Also, it's a good incentive for OpenTelemetry to specify the StateSet metric type! 😉
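
A rough sketch of what that could look like in the receiver's metadata.yaml (illustrative only; the attribute name and enum values are just the ones proposed above):

  attributes:
    state:
      description: The pod phase this data point refers to.
      type: string
      enum: [pending, running, succeeded, failed, unknown]

  metrics:
    k8s.pod.phase:
      enabled: true
      description: Whether the pod is in the given phase (1), or not (0).
      unit: 1
      gauge:
        value_type: int
      attributes: [state]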

bertysentry avatar Sep 22 '23 08:09 bertysentry

Take a look at the hw.status metric in the OTel Semantic Conventions, which uses the value-as-attribute approach I'm suggesting. In particular, see this note:

hw.status is currently specified as an UpDownCounter but would ideally be represented using a StateSet as defined in OpenMetrics. This semantic convention will be updated once StateSet is specified in OpenTelemetry. This planned change is not expected to have any consequence on the way users query their timeseries backend to retrieve the values of hw.status over time.
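
Concretely, the hw.status data points use the value-as-attribute shape, something like this (illustrative; state values as I recall them from the hardware semantic conventions):

  hw.status{state="ok"}       1
  hw.status{state="degraded"} 0
  hw.status{state="failed"}   0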

sirianni avatar Nov 06 '23 14:11 sirianni

We discussed this issue in the SIG meeting today. We decided we want the Spec to make a decision on StateSet before we move forward with any changes. I will look for an open issue in the Spec repo and create one to start the conversation if I can't find one.

TylerHelmuth avatar Nov 15 '23 17:11 TylerHelmuth

There is also an open question on where this metric should live. @sirianni can you provide more details on the scaling issue you've experienced?

TylerHelmuth avatar Nov 15 '23 17:11 TylerHelmuth

There is also an open question on where this metric should live. @sirianni can you provide more details on the scaling issue you've experienced?

Sure. We run several large Kubernetes clusters (over 1000 nodes, over 12,000 pods) where using a single cluster-wide collector runs into scalability issues (scrape timeouts, etc.). Here is the Datadog Agent PR I referenced in today's SIG call where Airbnb mentions similar scalability issues using a cluster-wide collector (i.e. kube-state-metrics) for this data.

Collecting this information from the kubelet via a DaemonSet scales incrementally as the cluster grows.

sirianni avatar Nov 15 '23 17:11 sirianni

Link to slack discussion mentioned in the meeting: https://cloud-native.slack.com/archives/C01NP3BV26R/p1623952435074700

dashpole avatar Nov 15 '23 17:11 dashpole

Reading through this issue again, is there actually anything we need from the spec? This link (https://opentelemetry.io/docs/specs/otel/compatibility/prometheus_and_openmetrics/#stateset) feels like it is pretty clear about how we handle StateSets in OTel metrics.

TylerHelmuth avatar Nov 15 '23 19:11 TylerHelmuth

That is about how to convert OpenMetrics StateSet metrics to OTLP. The idea discussed in the Slack thread was to make the StateSet pattern (or something close to it) a recommended pattern for OTel instrumentation. I would like to see, at minimum, the specification recommend using labels rather than values to represent enums (i.e. foo{state=bar} 1, rather than foo_state_bar 1 or foo 4, where value 4 implies state == bar).
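
For concreteness, an OpenMetrics StateSet exposition of this metric would look roughly like the following (my reading of the OpenMetrics spec, where the label carrying the state is named after the metric family):

  # TYPE k8s_pod_phase stateset
  k8s_pod_phase{k8s_pod_phase="Pending"} 0
  k8s_pod_phase{k8s_pod_phase="Running"} 1
  k8s_pod_phase{k8s_pod_phase="Succeeded"} 0
  k8s_pod_phase{k8s_pod_phase="Failed"} 0
  k8s_pod_phase{k8s_pod_phase="Unknown"} 0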

dashpole avatar Nov 15 '23 19:11 dashpole

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • receiver/k8scluster: @dmitryax @TylerHelmuth @povilasv

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jan 15 '24 03:01 github-actions[bot]

Ran into another example of some confusion/frustration recently

jaronoff97 avatar Apr 26 '24 16:04 jaronoff97

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions[bot] avatar Aug 26 '24 05:08 github-actions[bot]