
New component: RabbitMQ Exporter

Open swar8080 opened this issue 1 year ago • 12 comments

Overview

Use-cases

The use cases are similar to those of other durable messaging system exporters such as Kafka and Pulsar. This exporter could help meet teams where they are: teams that want to do custom telemetry processing but only have access to RabbitMQ.

Why prioritize RabbitMQ and not some other queue?

  • RabbitMQ appears to be the second-most popular open-source message queue behind Kafka, based on Google searches, market share, Docker Hub downloads, etc.
  • Datadog's Vector telemetry pipeline supports RabbitMQ, as well as Kafka/SQS, but not any other queues
  • RabbitMQ actively maintains a Go client library using the AMQP 0.9.1 Protocol

Other options considered

SQS: SQS is also popular, but I'm assuming someone from AWS would need to own/maintain the component.

AMQP 1.0 Protocol: This would allow supporting other queues like ActiveMQ, with a Go client library maintained by Microsoft. However:

  • RabbitMQ users have to install a plugin to use AMQP 1.0, whereas AMQP 0.9.1 is the primary protocol supported
  • Maintenance, testing, and configuration could be more complicated having to support multiple queues with the same component

STOMP Protocol: STOMP would also allow supporting other queues like ActiveMQ. However:

  • The Go client library seems to be maintained by an individual
  • Maintenance, testing, and configuration could be more complicated having to support multiple queues with the same component

JMS: A JMS exporter was requested in https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/27258 because many queues support it.

However, JMS is an API and not a wire protocol. The implementation of the protocol seems specific to each queue.

Example configuration for the component

Unique Configuration

  • publisher
    • routing_key (default = otlp_spans for traces, otlp_metrics for metrics, otlp_logs for logs): The AMQP routing key for the message, which will be delivered to a queue with that name (using the amq.direct exchange)
    • confirm_mode (default = true): Whether to wait for confirmation that RabbitMQ successfully received, or was unable to process, a message. This improves the accuracy of collector metrics on unprocessed data. The tradeoff is lower throughput from waiting for the asynchronous confirmation.
    • durable (default = true): Whether to instruct RabbitMQ to durably persist messages on disk. When publisher.confirm_mode is true, this may delay confirmation by a few hundred milliseconds, decreasing the pipeline's throughput.
  • endpoint (default = rabbit://localhost:5672): The URL of the RabbitMQ broker.
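
To make the proposed defaults concrete, here is a minimal Go sketch of how the settings above could be modeled. The type and function names (Signal, PublisherConfig, Config, DefaultRoutingKey, DefaultConfig) are hypothetical and are not taken from any existing exporter code; only the default values come from the list above.

```go
package main

import "fmt"

// Signal identifies the telemetry type handled by a pipeline (hypothetical type).
type Signal string

const (
	Traces  Signal = "traces"
	Metrics Signal = "metrics"
	Logs    Signal = "logs"
)

// PublisherConfig mirrors the proposed `publisher` settings (hypothetical struct).
type PublisherConfig struct {
	RoutingKey  string // queue name when publishing via the amq.direct exchange
	ConfirmMode bool   // wait for the broker's asynchronous confirmation
	Durable     bool   // ask the broker to persist messages on disk
}

// Config mirrors the proposed top-level settings (hypothetical struct).
type Config struct {
	Endpoint  string
	Publisher PublisherConfig
}

// DefaultRoutingKey returns the proposed per-signal default routing key.
// It returns an empty string for an unknown signal.
func DefaultRoutingKey(s Signal) string {
	switch s {
	case Traces:
		return "otlp_spans"
	case Metrics:
		return "otlp_metrics"
	case Logs:
		return "otlp_logs"
	default:
		return ""
	}
}

// DefaultConfig builds a Config populated with the proposed defaults.
func DefaultConfig(s Signal) Config {
	return Config{
		Endpoint: "rabbit://localhost:5672", // default as written in the proposal
		Publisher: PublisherConfig{
			RoutingKey:  DefaultRoutingKey(s),
			ConfirmMode: true,
			Durable:     true,
		},
	}
}

func main() {
	for _, s := range []Signal{Traces, Metrics, Logs} {
		cfg := DefaultConfig(s)
		fmt.Printf("%s: routing_key=%s confirm_mode=%t durable=%t endpoint=%s\n",
			s, cfg.Publisher.RoutingKey, cfg.Publisher.ConfirmMode,
			cfg.Publisher.Durable, cfg.Endpoint)
	}
}
```

Keeping the routing key default per signal means a user who configures nothing gets three separate queues, one per telemetry type.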

Common Configuration

The below is copied from the Pulsar exporter since it seems relevant to this exporter as well

  • auth / tls settings (TODO)
  • encoding of messages sent to RabbitMQ (TODO, need to research current OTEL best-practices)
  • timeout: timeout for sending an individual message
  • connection_timeout: timeout for establishing a connection to the broker and creating an AMQP channel
  • retry_on_failure
    • enabled:
    • initial_interval: Time to wait after the first failure before retrying; ignored if enabled is false
    • max_interval (default = ?): Is the upper bound on backoff; ignored if enabled is false
    • max_elapsed_time (default = ?): Is the maximum amount of time spent trying to send a batch; ignored if enabled is false
  • sending_queue
    • enabled (default = true)
    • num_consumers: Number of consumers that dequeue batches; ignored if enabled is false
    • queue_size: Maximum number of batches kept in memory before dropping data; ignored if enabled is false. Users should calculate this as num_seconds * requests_per_second where:
      • num_seconds is the number of seconds to buffer in case of a backend outage
      • requests_per_second is the average number of requests per second.
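
As a worked example of the queue_size guidance above, the following Go sketch computes num_seconds * requests_per_second. The function name QueueSize is hypothetical, and rounding up to a whole number of batches is an assumption made here so the buffer is never undersized.

```go
package main

import (
	"fmt"
	"math"
)

// QueueSize returns num_seconds * requests_per_second, rounded up to a whole
// number of batches. Non-positive inputs yield 0.
func QueueSize(numSeconds, requestsPerSecond float64) int {
	if numSeconds <= 0 || requestsPerSecond <= 0 {
		return 0
	}
	return int(math.Ceil(numSeconds * requestsPerSecond))
}

func main() {
	// Buffer 60 seconds of outage at an average of 10 requests per second.
	fmt.Println(QueueSize(60, 10)) // prints 600
}
```

For instance, surviving a one-minute broker outage at 10 requests per second needs room for 600 batches.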

Telemetry data types supported

Logs, metrics, and traces

Is this a vendor-specific component?

  • [ ] This is a vendor-specific component
  • [ ] If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

@swar8080

Sponsor (optional)

@atoulme

Additional context

Is this considered a vendor-specific component that needs to be implemented/maintained by the RabbitMQ team? If it's not, then I'm happy to implement this!

There's some more research needed for the design, but I'll wait to see if this is accepted/sponsored before going down that rabbit hole. Let me know if any extra info would help, though.

swar8080 avatar Nov 05 '23 16:11 swar8080

Sounds like a valid use case and component to me! Please review all the requirements of adding a new component.

From here you'll need a sponsor to be able to move forward. You can join the Collector SIG meetings and add this issue to the agenda in the Google doc to get more attention here.

As shared in another component proposal, not all components end up being sponsored, so feel free to go ahead and implement this in your own repository if you're not able to get much traction here soon.

Thanks for the proposal and willingness to contribute!

crobert-1 avatar Nov 28 '23 18:11 crobert-1

Confirm this is not a vendor-specific component, from what I can tell.

atoulme avatar Nov 29 '23 17:11 atoulme

How much of this exporter would you be able to model after the Kafka exporter? Ideally it could be a thin client and use encoding extensions to do most of the heavy lifting, making maintenance easier.

atoulme avatar Nov 29 '23 17:11 atoulme

Thanks @crobert-1 and @atoulme. I was assuming this wouldn't get sponsored and started implementing it just as a learning exercise. Here's what I have so far. This implementation might have optimizations and configuration options that aren't worth the complexity for an alpha component though.

I'll list out the possible scope to help decide if it's worth sponsoring/maintaining. From there I could break the implementation into smaller tasks / pull requests. Lmk what you suggest

Possible MVP

  • Message encoding logic (which I can likely re-use from other exporters like Kafka)
  • "Fire-and-forget" messaging semantics that assumes the user has the right queues already configured. This would be with confirm_mode=false, meaning the collector doesn't wait for asynchronous confirmation that the broker received the message.
  • Standard exporter retry/timeout/queue configuration
  • Custom code to restore unhealthy connections to RabbitMQ since the client library doesn't have this (already implemented)
  • Handle RabbitMQ's form of back pressure
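
The connection-restoring item in the list above could look roughly like the following Go sketch, which retries a dial function with capped exponential backoff. This is not the implementation referenced in this comment: Dialer, FlakyDialer, and Reconnect are hypothetical names, and the dial function stands in for whatever call opens the AMQP connection and channel.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Dialer is a hypothetical stand-in for opening a broker connection.
type Dialer func() error

var errRefused = errors.New("connection refused")

// FlakyDialer returns a Dialer that fails the first `failures` calls and then
// succeeds. It exists only to exercise Reconnect in this sketch.
func FlakyDialer(failures int) Dialer {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errRefused
		}
		return nil
	}
}

// Reconnect calls dial until it succeeds or maxAttempts is reached. The wait
// between attempts starts at initial and doubles, capped at maxInterval.
// It returns the number of attempts made.
func Reconnect(dial Dialer, initial, maxInterval time.Duration, maxAttempts int) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	wait := initial
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = dial(); err == nil {
			return attempt, nil
		}
		if attempt == maxAttempts {
			break // no point sleeping after the final failure
		}
		time.Sleep(wait)
		wait *= 2
		if wait > maxInterval {
			wait = maxInterval
		}
	}
	return maxAttempts, fmt.Errorf("reconnect failed after %d attempts: %w", maxAttempts, err)
}

func main() {
	// Two failures, then success on the third attempt.
	attempts, err := Reconnect(FlakyDialer(2), time.Millisecond, 10*time.Millisecond, 5)
	fmt.Println(attempts, err) // prints "3 <nil>"
}
```

A real exporter would also need to detect that a connection has become unhealthy in the first place; this sketch only covers the retry loop.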

Other possible enhancements

  • Wait for asynchronous confirmation that RabbitMQ got the message (already implemented)
  • Support automatic queue creation, or handle asynchronously returned messages that are unroutable as errors
  • Re-use of AMQP channels (i.e. logical connections) to avoid making a few network calls during each batch. This saves ~50ms per batch when locally connecting to an AWS queue in the nearest region. Already implemented but it's likely a premature optimization.

swar8080 avatar Nov 29 '23 22:11 swar8080

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions[bot] avatar Jan 29 '24 03:01 github-actions[bot]

@swar8080 I'd be happy to sponsor this component and help it land in contrib.

atoulme avatar Mar 06 '24 17:03 atoulme

Is this the exporter part of https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/10592? The existing RabbitMQ receiver can't be used to receive the exported data, right, or am I missing something?

romerod avatar Apr 09 '24 22:04 romerod

@romerod yep, same idea as the issue you mentioned. This component is for sending telemetry to RabbitMQ. The rabbitmqreceiver is for collecting RabbitMQ usage metrics.

swar8080 avatar Apr 10 '24 00:04 swar8080

Thanks @swar8080, understood this, but which receiver can be used to receive the telemetry on the receiving side?

romerod avatar Apr 10 '24 21:04 romerod

@romerod gotcha, so there's no component for pulling messages from RabbitMQ. This component is just for pushing messages to RabbitMQ.

swar8080 avatar Apr 10 '24 22:04 swar8080

This is an exciting component! I am in a similar situation to @romerod. I'd be interested in using RabbitMQ queues that are distributed around the edge with this exporter to export OTEL data in a fashion that decouples the direct OTLP protocol connection. I then want a central location that connects to all the "edge" locations and collects the OTEL data from each RabbitMQ instance, for which there is currently no RabbitMQ receiver.

cwegener avatar May 10 '24 02:05 cwegener

This exporter is a good fit for our IoT edge solution. I have tried to set up the rabbitmq exporter with version 0.103.0 without luck: it is not listed among the valid exporters. Maybe I did something wrong. If not, when will this exporter be part of the distribution?

gauthierjf avatar Jun 28 '24 17:06 gauthierjf

Hi @gauthierjf, rabbitmq exporter is available starting in 0.104.0

swar8080 avatar Jul 02 '24 12:07 swar8080

@swar8080: Since this component has now been shipped, should we close this issue?

crobert-1 avatar Jul 02 '24 15:07 crobert-1

Is there a proposal to enable logs for the rabbitmq receiver? I'd imagine that would produce one log per message, optionally including the body with a size limit.

alrz avatar Jul 20 '24 16:07 alrz