[exporter/prometheusremotewrite] Support Prometheus Remote-Write v2
Component(s)
exporter/prometheusremotewrite
Is your feature request related to a problem? Please describe.
Prometheus remote write v2 spec is close to being finalized, it would be good time to start working on implementing support for it in the prometheus remote write exporter component. Soon backend such as prometheus, mimir, thanos etc will start supporting it. It would great to start working on implementing to gain it's benefits,.
Describe the solution you'd like
I would like to contribute implementation of the remote write v2 https://github.com/prometheus/docs/pull/2462 into the prometheus remote write exporter, most likely behind a feature flag since long-term v2 version is supposed to replace the current implementation.
Describe alternatives you've considered
Potentially it could introduced as a separate component but I hope that won't be necessary.
Additional context
The current implementation of remote write v2 lives in it's own branch of prometheus, https://github.com/prometheus/prometheus/tree/remote-write-2.0 , but should be merged soon.
TODO
Translation layer:
- [x] Support for metadata https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40469
- [ ] Handle conflicts in metrics https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/39570
- [x] Handle target_info https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40493
- [ ] Handle exemplars, seems partially implemented in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37156
- [ ] Support for metric types
- [x] MetricTypeSum https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37156
- [ ] MetricTypeHistogram https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40494
- [ ] MetricTypeExponentialHistogram
- [ ] MetricTypeSummary
- [ ] Double check this logic, it's not present in RW1 https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/1af0f49cb6610e4d2b9fc2e7cd9a162be3aafb28/pkg/translator/prometheusremotewrite/metrics_to_prw_v2.go#L98
- [ ] Figure out what to do with namespace added to
target_infoin both RW1 and RW2 https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40493#discussion_r2127246673
Exporter:
- [x] Get initial PR merged http://github.com/open-telemetry/opentelemetry-collector-contrib/pull/38235
- [x] Sort timeseries https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/399805
- [x] Implement batching https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40051
- [ ] batch symbolsTable instead of sending it whole
- [ ] Try sorting timeseries while batching??
- [ ] Check if
sizeOfSeries > maxBatchByteSizecould happen, this might affect PRW1 too https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40051#discussion_r2129649221
- [x] Add test case with invalid proto message https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/35888#discussion_r1840229263 https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40562
- [ ] Log warning when missing RW2 spec headers https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/40559
- [ ] Implement WAL support
- [ ] Go over the spec and confirm it follows it
- [ ] Add support for NoTranslation mode (similar to https://github.com/open-telemetry/opentelemetry-go/issues/6668)
- [ ] Look into generating metrics from Written headers as requested here https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/39799#issuecomment-2851611496
- [ ] Improve tests https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/c9e3f1ab8cc431ca4c2970e188d8575706e43df9/exporter/prometheusremotewriteexporter/exporter_test.go#L477
- [ ] Optionally add e2e tests?
- [ ] Add testing of otel inprometheus/compliance repo https://github.com/prometheus/compliance/issues/101
Snippet for later
symbols := converter.symbolTable.Symbols()
for _, tss := range converter.timeSeries() {
lbls := make([]prompb.Label, 0, len(tss.LabelsRefs)/2)
for i := 0; i < len(tss.LabelsRefs); {
lbls = append(lbls, prompb.Label{Name: symbols[tss.LabelsRefs[i]], Value: symbols[tss.LabelsRefs[i+1]]})
i += 2
}
assert.Equal(t, tc.wantLabels, lbls)
}
Pinging code owners:
- exporter/prometheusremotewrite: @Aneurysm9 @rapphil
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Thanks for the proposal @jmichalek132, makes sense to me. Agreed that a feature gate is a good idea here. 👍
Interesting! How can I contribute to this feature? @crobert-1 ,@jmichalek132
Remote write 2.0 is still experimental in prometheus, however it will be enabled by default in the the upcoming version version 2.54.0.
Having said that I would welcome a contribution that will add support to PRW2.0. I think we can add a configuration option protocol_version that can be used by users to select what version of the remote write protocol they want to use. Valid values would be 1.0 and 2.0 with default value 1.0. Once Prometheus deprecate 1.0 we can plan on moving the default to 2.0.
Thanks @rapphil . Will look into the Prometheus remote write 2.0 and plan the changes with above said
cc @ArthurSens We were considering making this a CNCF intern project: https://github.com/cncf/mentoring/pull/1281
@dashpole Good to hear that this issue under CNCF Mentorship. Would like to contribute to this under LFX Mentorship.
What could be application process to be part of applying to this project? cc: @ArthurSens
Hi @varshith257, the application happens through the LFX website. There's a lot of instructions on their website :)
https://docs.linuxfoundation.org/lfx/mentorship/mentees
@ArthurSens Thanks for pointing me there :)
Right now will apply to this project. Any other tasks that has to be done before application apart from resume and cover letter?
@ArthurSens Thanks for pointing me there :)
Right now will apply to this project. Any other tasks that has to be done before application apart from resume and cover letter?
I don't think there's any other requirements :)
Hello! @ArthurSens @dashpole @aknuds1
I am Manul, backend Software engineer from India. I am interested to contribute in the development of RWv2 exporter and have applied under LFX mentorship 2024. I am trying to get myself familar with the Prometheus and OTEL through docs. I have successfully setup Prometheus locally for development.
While going through the RW2.0 specs, I learnt about the new Protobuf message specs io.prometheus.write.v2.Request. Also in the v2 specs sheet, multiple references are given to v1 docs to compare and explain repetitive reasons in FAQs, but it seems the link is not working or the docs are moved. So is it necessary to go through it?
Apart from these, what do you suggest to try on to learn/test, relavant to this feature and also more generic to get involved with active code contributions in both the organizations.
I think it should link to https://prometheus.io/docs/specs/remote_write_spec/. I would try out the existing PRW exporter with the prometheus server as a first step. There should be time to get familiar with the components after the internship starts if you are selected.
more generic to get involved with active code contributions in both organizations.
Most open-source projects have community meetings and guidelines somewhere in their repo:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md https://github.com/prometheus/prometheus/blob/main/CONTRIBUTING.md
These documents usually guide you in a good direction for code contributions :)
It's also a good idea to look for issues labeled as good first issue or help wanted. Communicating over slack (in the appropriate channels) usually speeds up the process as well, just keep in mind that we're all volunteers here and response times might not be great sometimes.
Ping @dashpole could you please assign this to me as I was selected for the lfx mentorship project? Thank you.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
- exporter/prometheusremotewrite: @Aneurysm9 @rapphil @dashpole @ArthurSens
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I am interested in helping out with this effort
Can you add an item to the translator checklist to support NoTranslation mode (similar to https://github.com/open-telemetry/opentelemetry-go/issues/6668)
Reopening since this was closed by mistake!
@jmichalek132 , let's avoid adding Fixes #33661 to our PR descriptions :)
Reopening, closed by accident