opentelemetry-collector-contrib

groupbyattrsprocessor drops metric metadata

Open braydonk opened this issue 1 year ago • 11 comments

Component(s)

processor/groupbyattrs

What happened?

Description

When groupbyattrsprocessor makes a new metric, it does not copy the metric metadata. I assume this is because Metadata is a relatively new field and isn't fully respected everywhere yet.

Steps to Reproduce

Found this by testing the new untyped metric support in Prometheus. It adds a Metadata key called prometheus.type. So that's the easiest way to see the effect. Create a pipeline from prometheusreceiver to groupbyattrsprocessor to debugexporter and have it scrape some manner of metrics.
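A configuration along these lines should reproduce it. This is a minimal sketch rather than the exact config used: the job name, scrape target, and grouping key are placeholders, so substitute whatever endpoint and label you have available.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: example                 # hypothetical job name
          scrape_interval: 15s
          static_configs:
            - targets: ["localhost:9090"]   # hypothetical scrape target

processors:
  groupbyattrs:
    keys:
      - some_label                          # hypothetical attribute key to group by

exporters:
  debug:
    verbosity: detailed                     # print full metric details

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [groupbyattrs]
      exporters: [debug]
```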

Expected Result

The prometheus.type metadata should still be present when the value is seen in debugexporter.

Actual Result

It's gone.

Other notes

Collector version

v0.102.0

Environment information

Environment

OS: Debian 12
Compiler (if manually compiled): go 1.22.3

OpenTelemetry Collector configuration

No response

Log output

Nothing of note.

Additional context

Should there be an actual API in pdata for making full metric copies like this? What the groupbyattrsprocessor has to do here is pretty brittle for exactly this reason, and I'm not sure if there are other processors doing something similar.

braydonk avatar Jun 06 '24 19:06 braydonk

Pinging code owners:

  • processor/groupbyattrs: @rnishtala-sumo

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jun 06 '24 19:06 github-actions[bot]

FYI @dashpole @ridwanmsharif

braydonk avatar Jun 06 '24 19:06 braydonk

There is a CopyTo function for metrics, but I'm not sure if that is what is needed here.

dashpole avatar Jun 06 '24 19:06 dashpole

For reference, this is the spot that does not copy metadata: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/dd2e45aed3797b14a7b9723c5064141553004566/processor/groupbyattrsprocessor/processor.go#L204-L208

braydonk avatar Jun 06 '24 19:06 braydonk

@braydonk looks like this could be done using https://pkg.go.dev/go.opentelemetry.io/collector/pdata/pmetric#Metric.CopyTo

as suggested previously. I think the ask here makes sense. Please go ahead if you'd like to make the change. I can review the changes when ready.

rnishtala-sumo avatar Jun 10 '24 21:06 rnishtala-sumo

Removing needs triage based on code owner feedback.

crobert-1 avatar Jun 10 '24 22:06 crobert-1

If nobody is working on this issue, I would like to take it cc @evan-bradley

odubajDT avatar Jun 25 '24 09:06 odubajDT

After some investigation, using the Metric.CopyTo() function is problematic here. The function also copies the datapoints from the original metric, which goes directly against the purpose of this processor, where datapoints are moved between metrics. Creating a new metric here (with an empty datapoint slice) and then using CopyTo() with it would therefore add datapoints that are in the input and should not be present in the output.

Therefore I would suggest adding the missing CopyTo() call only for the metadata here, unless we want to do a major refactoring of the processor's code.

I am open to suggestions.
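To make the difference concrete, here is a simplified sketch of the two options. The `Metric` struct and both helper functions are hypothetical stand-ins written for illustration, not the pdata API or the processor's actual code; they only model the behavior described above, where a full copy carries the datapoints along and a metadata-only copy does not.

```go
package main

import "fmt"

// Metric is a hypothetical, simplified stand-in for a pdata metric.
type Metric struct {
	Name       string
	Metadata   map[string]string
	DataPoints []float64
}

// cloneMetadata returns an independent copy of the metadata map.
func cloneMetadata(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// fullCopy models a full metric copy: the datapoints come along too,
// which is what the processor does not want for a freshly grouped metric.
func fullCopy(src Metric) Metric {
	return Metric{
		Name:       src.Name,
		Metadata:   cloneMetadata(src.Metadata),
		DataPoints: append([]float64(nil), src.DataPoints...),
	}
}

// newMetricWithMetadata models the narrower fix: create an empty metric
// and copy only the descriptive fields and the metadata.
func newMetricWithMetadata(src Metric) Metric {
	return Metric{
		Name:     src.Name,
		Metadata: cloneMetadata(src.Metadata),
	}
}

func main() {
	src := Metric{
		Name:       "example_metric",
		Metadata:   map[string]string{"prometheus.type": "unknown"}, // illustrative value
		DataPoints: []float64{1, 2, 3},
	}

	full := fullCopy(src)
	lean := newMetricWithMetadata(src)

	fmt.Println("full copy datapoints:", len(full.DataPoints))
	fmt.Println("metadata-only datapoints:", len(lean.DataPoints), "metadata:", lean.Metadata)
}
```

With the metadata-only variant, the processor can keep moving datapoints into the new metric itself, and the metadata still survives.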

odubajDT avatar Jun 26 '24 09:06 odubajDT

Thanks for taking on this issue!

Might be naive, but perhaps the datapoints could just be deleted from the copy of the metric? If that won't work then it's fine to just do the metadata copy. Just would be nice if the full metric CopyTo function would work just to avoid this kind of thing happening again.

braydonk avatar Jun 26 '24 20:06 braydonk

This should work, but when deleting datapoints, we still need to first find out what type we are dealing with (sum/gauge/histogram...) and only then delete the appropriate datapoints (these are also different types: numeric value, histogram value, ...).

Therefore I do not see any improvement in the logic here, since the logic determining the metric type still needs to stay in place, and instead of creating empty metrics, we would copy them and then delete parts of them.
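A rough sketch of why copy-then-delete does not simplify anything. The types below are hypothetical simplifications, not the pdata API; the point is only that clearing datapoints still needs a switch over the metric type, because each type holds a different kind of datapoint.

```go
package main

import "fmt"

// Hypothetical datapoint types: numeric and histogram points differ in shape.
type NumberPoint struct{ Value float64 }
type HistogramPoint struct {
	Count uint64
	Sum   float64
}

// Hypothetical metric data types, each with its own datapoint slice.
type Gauge struct{ Points []NumberPoint }
type Sum struct{ Points []NumberPoint }
type Histogram struct{ Points []HistogramPoint }

// Metric is a simplified stand-in; Data holds *Gauge, *Sum, or *Histogram.
type Metric struct {
	Name string
	Data any
}

// clearDataPoints empties the datapoints of a copied metric. The type
// switch is unavoidable, which is the same branching the processor
// already needs when it creates an empty metric of the right type.
func clearDataPoints(m *Metric) error {
	switch d := m.Data.(type) {
	case *Gauge:
		d.Points = nil
	case *Sum:
		d.Points = nil
	case *Histogram:
		d.Points = nil
	default:
		return fmt.Errorf("unsupported metric data type %T", m.Data)
	}
	return nil
}

func main() {
	h := &Histogram{Points: []HistogramPoint{{Count: 3, Sum: 4.5}}}
	m := Metric{Name: "example_histogram", Data: h}

	if err := clearDataPoints(&m); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("remaining histogram points:", len(h.Points))
}
```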

odubajDT avatar Jun 27 '24 08:06 odubajDT

Ah okay I understand the issue now. Probably fine to just use the metadata copy then; it's probably sufficiently rare for that metric proto to change much for this kind of thing to happen again. Thanks!

braydonk avatar Jun 27 '24 13:06 braydonk

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • processor/groupbyattrs: @rnishtala-sumo

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Sep 02 '24 03:09 github-actions[bot]

Fixed by https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33781

dashpole avatar Sep 03 '24 14:09 dashpole