New `opentelemetry` source and sink
OpenTelemetry is a specification for collecting observability data.
Their collector and libraries are of questionable quality. We'd like Vector to support OT through their various protobufs and become the best OT collector.
We should break this down into smaller tasks, likely around their various data types (logs, metrics, and traces). I'd like to start with tracing, if possible, to introduce that type into our data model.
- [x] #13551
- [ ] Support ingesting OpenTelemetry metrics
- [ ] Support ingesting OpenTelemetry traces
- [ ] #13622
- [ ] Support sending OpenTelemetry metrics
- [ ] Support sending OpenTelemetry traces
Looks relevant #576
Since OpenTracing and OpenCensus merged into the single OpenTelemetry project, is supporting it as both input and output being considered? It is important for making Vector a replacement for Datadog on the tracing layer (https://www.datadoghq.com/blog/opentelemetry-instrumentation/), and it may also help to build one layer for logs, metrics, and traces. This may also help to build an architecture that is not vendor-locked and allows switching to other providers easily.
Thanks @szibis. Agreed, that's the idea with the OpenTelemetry components. We also want them to enforce our tracing data model when we start to implement it.
> This may also help to build an architecture that is not vendor-locked and allows switching to other providers easily.
Agreed! That's the primary idea behind Vector. That said, Vector also wants to acknowledge the current state and help users migrate towards open standards.
Hi! We have been using Vector for a while as a log exporter (stdout -> AWS Kinesis -> ElasticSearch) while we have a separate pipeline for OpenTelemetry traces (application pushes to a Jaeger collector). We were considering the option of moving to Honeycomb for our observability needs and I noticed Vector provides a honeycomb sink, but it does not provide any OpenTelemetry source. Would we therefore be losing information by using stdout as source and honeycomb as sink compared to pushing our OpenTelemetry data directly into the OpenTelemetry collector and using that to push the data into Honeycomb?
I'd prefer to have a single agent for all our telemetry needs, but it'd be interesting to understand better :eyes:
Any update? I see that the tasks related to OpenTelemetry were removed from several milestones.
@kaarolch we are also currently planning on adding OpenTelemetry support to Vector by the end of the year. This should be more definite soon.
That sounds great! I see on the Vector Public Roadmap that this work will start next month. I would like to see the design details for the `opentelemetry` source and sink.
Sure thing. We'll likely be posting RFCs for this work before any work starts.
@binarylogic since 2 years have passed, do you still think the otel collector is of questionable quality? I actually see it as much more flexible than Vector, for example in its sampling strategies.
This is still on our near term roadmap. We didn't get to it this quarter like we expected, but anticipate working on it in Q1.
Is this something that you'd be willing to accept open source contributions for? This is fairly major work and in theory close to being worked on, so I'd understand if you said no. I was personally considering setting up an opentelemetry sink for metrics.
I found that the Tracing support RFC hasn't been completed yet. I'm willing to join the discussion about this work.
This is something a team member is planning to pick up this quarter. If priorities shift and we aren't able to get to it, we'd be happy to see a PR for it. We are still in the process of adding support to Vector's internal data model for traces (https://github.com/vectordotdev/vector/pull/10483). That will need to happen first.
I was curious and checking if this existed yet and found this issue.
Please post to this issue if your priorities shift -- and if anyone else starts working on this please let it be known here :). I ask this because I'm always looking for more ways to learn more Rust and reasons to bother @blt :) -- I already work on OpenTelemetry, so thought this would be a good way to cover both those, assuming I can find the time myself.
> I ask this because I'm always looking for more ways to learn more Rust and reasons to bother @blt :)
:wave:
😄 will do. This issue will be assigned if we start work on it.
@jszwedko is this issue still on the Vector team's radar? It looks like it has been deprioritized multiple times since 2019. Thanks
We have this scheduled for the upcoming quarter - starting with traces and going from there based on the state/stability of the event type 👍
Any updates? 🙏
This is likely to be on our roadmap for Q3.
We have the use cases to collect logs with Vector and send them to various sinks including OTLP-compatible endpoints. Vector's support for OTLP would be fantastic. :)
@jszwedko
Any updates? :pray:
> This is likely to be on our roadmap for Q3.
Any % you can put on how likely "likely" is? I'm planning out an observability pipeline project and I'd prefer to use Vector over the OpenTelemetry Collector for VRL and the Lua transform, but this has been repeatedly pushed down the Vector team's priority list so it's hard to plan around :-)
Edit: Oh I just saw that a logs source is possibly getting close to merged in #13320 which is a great start on this. Makes it seem more real!
We discussed roadmaps at the end of last week and the plan is for me to work on sources and sinks for logs/metrics/traces all quarter long.
Hi @spencergilbert, are there any task lists that we can pick up to accelerate this process?
Hey @caibirdme, it sounded like you were interested in OTel logs sink support? If that's the case I opened https://github.com/vectordotdev/vector/issues/13622 to track and discuss.
If you'd prefer to work on a different portion of the source or sink, that's fine too. Just let me know!
Yes, I'm interested in that because we badly want this feature in our system. We're building a new observability system based on OpenTelemetry and ClickHouse. Once the `opentelemetry` source is supported, we can ingest logs (OTel and non-OTel), unify them with VRL, and sink them to ClickHouse. The problem is that Vector's `clickhouse` sink, which is based on the HTTP protocol and the JSONEachRow format, is inefficient and incurs heavy overhead on ClickHouse. There are two ways to solve this:
- optimize the `clickhouse` sink by switching to the ClickHouse native protocol, or support a more efficient format such as Apache Arrow or Parquet
- support an OTel sink and export logs to opentelemetry-collector, which contains a high-performance ClickHouse exporter (the reason we are not using otel-collector directly is that our logs are in different formats, and we need VRL to unify them)
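
To make the flow above concrete, here is a minimal sketch of "unify logs in different formats, then ship them as JSONEachRow". It is written in Python purely for illustration and is not Vector or VRL code. The `normalize` and `to_json_each_row` helpers and the `message`/`level` schema are hypothetical; the only thing taken from ClickHouse itself is that JSONEachRow means one JSON object per line.

```python
import json


def normalize(raw: str) -> dict:
    """Map a raw log line to a uniform record (hypothetical schema)."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        # Structured log: pick out the fields we care about.
        return {
            "message": str(parsed.get("msg", parsed.get("message", ""))),
            "level": str(parsed.get("level", "info")).lower(),
        }
    # Anything that is not a JSON object is treated as plain text.
    return {"message": raw.strip(), "level": "info"}


def to_json_each_row(records) -> str:
    """Encode records as JSONEachRow: one JSON object per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


lines = ['{"msg": "started", "level": "INFO"}', "plain text line"]
payload = to_json_each_row(normalize(line) for line in lines)
```

Every row is serialized as text and has to be parsed again on the ClickHouse side, which is the kind of overhead a binary or columnar format would avoid.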
> - optimize clickhouse sink by switching to clickhouse native protocol or support more efficient format such as apache arrow or parquet
I'd recommend opening an issue regarding the `clickhouse` sink, if you haven't/if there isn't one already. It's definitely out of scope for this issue/my current work - we'd be happy to look into improvements or review a proposed contribution to address the issues you've had.
It sounds like for your use-case it would ultimately be better to improve the `clickhouse` sink so you could drop an additional tool from your pipeline, but in the end it's definitely more about what and where you'd like to contribute.
Is the work for this sliced up into actionable pieces somewhere that I can look at? Would like to contribute to this
Hi @cetanu! Our near-term focus is on the source: adding support for ingesting metrics and traces. If you are interested in contributing to this, creating a new `opentelemetry` sink to send data via OTLP would likely be a good place to start.
Hi, has anyone done a benchmark or test of otel-collector lately? @binarylogic's words, "Their collector and libraries are of questionable quality", were posted nearly 3 years ago, so a benchmark or test of the current version of otel-collector would be very helpful. If anyone has done this, could you share it (or share your conclusions if the results cannot be shared)?
Update: I found a benchmark report for aws-otel-collector v0.21.0 (latest as of 2022-9-5), for your reference: https://aws-observability.github.io/aws-otel-collector/benchmark/report
I also found a related Vector issue that is still open (2022-1-15): https://github.com/vectordotdev/vector/issues/13132