opentelemetry-collector-contrib icon indicating copy to clipboard operation
opentelemetry-collector-contrib copied to clipboard

[exporter/prometheusremotewrite] Support Prometheus Remote-Write v2

Open jmichalek132 opened this issue 1 year ago • 18 comments

Component(s)

exporter/prometheusremotewrite

Is your feature request related to a problem? Please describe.

Prometheus remote write v2 spec is close to being finalized, it would be good time to start working on implementing support for it in the prometheus remote write exporter component. Soon backend such as prometheus, mimir, thanos etc will start supporting it. It would great to start working on implementing to gain it's benefits,.

Describe the solution you'd like

I would like to contribute implementation of the remote write v2 https://github.com/prometheus/docs/pull/2462 into the prometheus remote write exporter, most likely behind a feature flag since long-term v2 version is supposed to replace the current implementation.

Describe alternatives you've considered

Potentially it could introduced as a separate component but I hope that won't be necessary.

Additional context

The current implementation of remote write v2 lives in it's own branch of prometheus, https://github.com/prometheus/prometheus/tree/remote-write-2.0 , but should be merged soon.

TODO

Translation layer:

  • [x] Support for metadata https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40469
  • [ ] Handle conflicts in metrics https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/39570
  • [x] Handle target_info https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40493
  • [ ] Handle exemplars, seems partially implemented in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37156
  • [ ] Support for metric types
    • [x] MetricTypeSum https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37156
    • [ ] MetricTypeHistogram https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40494
    • [ ] MetricTypeExponentialHistogram
    • [ ] MetricTypeSummary
    • [ ] Double check this logic, it's not present in RW1 https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/1af0f49cb6610e4d2b9fc2e7cd9a162be3aafb28/pkg/translator/prometheusremotewrite/metrics_to_prw_v2.go#L98
  • [ ] Figure out what to do with namespace added to target_info in both RW1 and RW2 https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40493#discussion_r2127246673

Exporter:

  • [x] Get initial PR merged http://github.com/open-telemetry/opentelemetry-collector-contrib/pull/38235
  • [x] Sort timeseries https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/399805
  • [x] Implement batching https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40051
    • [ ] batch symbolsTable instead of sending it whole
    • [ ] Try sorting timeseries while batching??
    • [ ] Check if sizeOfSeries > maxBatchByteSize could happen, this might affect PRW1 too https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40051#discussion_r2129649221
  • [x] Add test case with invalid proto message https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/35888#discussion_r1840229263 https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40562
  • [ ] Log warning when missing RW2 spec headers https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40559
  • [ ] Implement WAL support
  • [ ] Go over the spec and confirm it follows it
  • [ ] Add support for NoTranslation mode (similar to https://github.com/open-telemetry/opentelemetry-go/issues/6668)
  • [ ] Look into generating metrics from Written headers as requested here https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/39799#issuecomment-2851611496
  • [ ] Improve tests https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/c9e3f1ab8cc431ca4c2970e188d8575706e43df9/exporter/prometheusremotewriteexporter/exporter_test.go#L477
  • [ ] Optionally add e2e tests?
  • [ ] Add testing of otel inprometheus/compliance repo https://github.com/prometheus/compliance/issues/101

Snippet for later

			symbols := converter.symbolTable.Symbols()
			for _, tss := range converter.timeSeries() {
				lbls := make([]prompb.Label, 0, len(tss.LabelsRefs)/2)
				for i := 0; i < len(tss.LabelsRefs); {
					lbls = append(lbls, prompb.Label{Name: symbols[tss.LabelsRefs[i]], Value: symbols[tss.LabelsRefs[i+1]]})
					i += 2
				}
				assert.Equal(t, tc.wantLabels, lbls)
			}

jmichalek132 avatar Jun 19 '24 18:06 jmichalek132

Pinging code owners:

  • exporter/prometheusremotewrite: @Aneurysm9 @rapphil

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jun 19 '24 18:06 github-actions[bot]

Thanks for the proposal @jmichalek132, makes sense to me. Agreed that a feature gate is a good idea here. 👍

crobert-1 avatar Jun 20 '24 15:06 crobert-1

Interesting! How can I contribute to this feature? @crobert-1 ,@jmichalek132

varshith257 avatar Jul 27 '24 03:07 varshith257

Remote write 2.0 is still experimental in prometheus, however it will be enabled by default in the the upcoming version version 2.54.0.

Having said that I would welcome a contribution that will add support to PRW2.0. I think we can add a configuration option protocol_version that can be used by users to select what version of the remote write protocol they want to use. Valid values would be 1.0 and 2.0 with default value 1.0. Once Prometheus deprecate 1.0 we can plan on moving the default to 2.0.

rapphil avatar Jul 29 '24 16:07 rapphil

Thanks @rapphil . Will look into the Prometheus remote write 2.0 and plan the changes with above said

varshith257 avatar Jul 29 '24 16:07 varshith257

cc @ArthurSens We were considering making this a CNCF intern project: https://github.com/cncf/mentoring/pull/1281

dashpole avatar Jul 30 '24 13:07 dashpole

@dashpole Good to hear that this issue under CNCF Mentorship. Would like to contribute to this under LFX Mentorship.

What could be application process to be part of applying to this project? cc: @ArthurSens

varshith257 avatar Jul 30 '24 14:07 varshith257

Hi @varshith257, the application happens through the LFX website. There's a lot of instructions on their website :)

https://docs.linuxfoundation.org/lfx/mentorship/mentees

ArthurSens avatar Aug 05 '24 18:08 ArthurSens

@ArthurSens Thanks for pointing me there :)

Right now will apply to this project. Any other tasks that has to be done before application apart from resume and cover letter?

varshith257 avatar Aug 05 '24 18:08 varshith257

@ArthurSens Thanks for pointing me there :)

Right now will apply to this project. Any other tasks that has to be done before application apart from resume and cover letter?

I don't think there's any other requirements :)

ArthurSens avatar Aug 07 '24 12:08 ArthurSens

Hello! @ArthurSens @dashpole @aknuds1 I am Manul, backend Software engineer from India. I am interested to contribute in the development of RWv2 exporter and have applied under LFX mentorship 2024. I am trying to get myself familar with the Prometheus and OTEL through docs. I have successfully setup Prometheus locally for development. While going through the RW2.0 specs, I learnt about the new Protobuf message specs io.prometheus.write.v2.Request. Also in the v2 specs sheet, multiple references are given to v1 docs to compare and explain repetitive reasons in FAQs, but it seems the link is not working or the docs are moved. So is it necessary to go through it?

Apart from these, what do you suggest to try on to learn/test, relavant to this feature and also more generic to get involved with active code contributions in both the organizations.

manulpatel avatar Aug 24 '24 07:08 manulpatel

I think it should link to https://prometheus.io/docs/specs/remote_write_spec/. I would try out the existing PRW exporter with the prometheus server as a first step. There should be time to get familiar with the components after the internship starts if you are selected.

dashpole avatar Aug 29 '24 18:08 dashpole

more generic to get involved with active code contributions in both organizations.

Most open-source projects have community meetings and guidelines somewhere in their repo:

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md https://github.com/prometheus/prometheus/blob/main/CONTRIBUTING.md

These documents usually guide you in a good direction for code contributions :)

It's also a good idea to look for issues labeled as good first issue or help wanted. Communicating over slack (in the appropriate channels) usually speeds up the process as well, just keep in mind that we're all volunteers here and response times might not be great sometimes.

ArthurSens avatar Aug 29 '24 19:08 ArthurSens

Ping @dashpole could you please assign this to me as I was selected for the lfx mentorship project? Thank you.

jmichalek132 avatar Sep 26 '24 13:09 jmichalek132

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • exporter/prometheusremotewrite: @Aneurysm9 @rapphil @dashpole @ArthurSens

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jan 13 '25 03:01 github-actions[bot]

I am interested in helping out with this effort

ywwg avatar May 02 '25 13:05 ywwg

Can you add an item to the translator checklist to support NoTranslation mode (similar to https://github.com/open-telemetry/opentelemetry-go/issues/6668)

ywwg avatar May 02 '25 19:05 ywwg

Reopening since this was closed by mistake!

@jmichalek132 , let's avoid adding Fixes #33661 to our PR descriptions :)

ArthurSens avatar May 14 '25 17:05 ArthurSens

Reopening, closed by accident

ArthurSens avatar Aug 08 '25 23:08 ArthurSens