
Kafka Integration

Open smeubank opened this issue 2 years ago • 4 comments

Problem Statement

There is no integration in the Python SDK that provides auto-instrumentation for Kafka interactions. As a result, manual instrumentation is required to capture span information and get OOTB distributed tracing between services that communicate over Kafka.

Solution Brainstorm

There is already a Redis integration. Redis is not the same as Kafka, but it can also act as a messaging service.

OTel also has some Kafka integration. Would it be possible to have something similar, or to re-use their integration in the Sentry SDK?

smeubank avatar Sep 12 '23 09:09 smeubank

I tried building this into arroyo (where this IMO can be more effective), and IMO our transactions concept does not map particularly well onto any kafka concepts. there is no affordance for batching in our datamodel or OTEL's

error reporting works fine OOTB

untitaker avatar Jan 26 '24 15:01 untitaker

we could still technically inject headers while publishing and continue traces while subscribing though right, or is that bad?
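A minimal sketch of the inject-and-continue idea, assuming Kafka headers are a list of `(str, bytes)` tuples (the shape used by common Python Kafka clients) and that the trace context travels in `sentry-trace` and `baggage` headers. The helper names `inject_trace_headers` and `extract_trace_headers` are hypothetical and not part of any SDK:

```python
from typing import Dict, List, Optional, Tuple

# Assumed header shape: a list of (name, bytes value) tuples.
KafkaHeaders = List[Tuple[str, bytes]]

# Header names used to carry trace context (assumption for this sketch).
TRACE_HEADER_NAMES = ("sentry-trace", "baggage")


def inject_trace_headers(
    headers: Optional[KafkaHeaders],
    traceparent: str,
    baggage: Optional[str] = None,
) -> KafkaHeaders:
    """Return a copy of `headers` with trace context added (producer side)."""
    # Drop stale trace headers so a re-published message has one parent only.
    out = [(k, v) for k, v in (headers or []) if k.lower() not in TRACE_HEADER_NAMES]
    out.append(("sentry-trace", traceparent.encode("utf-8")))
    if baggage:
        out.append(("baggage", baggage.encode("utf-8")))
    return out


def extract_trace_headers(headers: Optional[KafkaHeaders]) -> Dict[str, str]:
    """Return the trace headers as a plain str dict (consumer side)."""
    found: Dict[str, str] = {}
    for key, value in headers or []:
        if key.lower() in TRACE_HEADER_NAMES and value is not None:
            found[key.lower()] = value.decode("utf-8")
    return found
```

On the producer side, the `traceparent` and `baggage` values could plausibly come from `sentry_sdk.get_traceparent()` and `sentry_sdk.get_baggage()`; on the consumer side, the extracted dict could be handed to `sentry_sdk.continue_trace(...)` before starting a transaction. That wiring is not shown here and would need checking against the SDK version in use.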

sl0thentr0py avatar Jan 26 '24 15:01 sl0thentr0py

> we could still technically inject headers while publishing and continue traces while subscribing though right, or is that bad?

that part works perfectly fine. the issue is that business logic in arroyo is usually written in a way where messages are processed in batches. so one function call processes 200 messages at once.

logically it means that there should be a transaction measuring that function's execution time (normalized by batch size?) -- but there's multiple traceparents to consider

we can make tracing work for consumers who just don't batch (we have a lot of them in sentry), but those tend to not be performance sensitive
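To illustrate the batching problem, here is a rough sketch that inspects one batch and reports how many distinct traces it touches, plus the batch execution time normalized by batch size. It assumes each message carries headers as a list of `(str, bytes)` tuples and that the `sentry-trace` value starts with the trace ID followed by `-`. The names `trace_id_of` and `summarize_batch` are hypothetical; this is not how arroyo works:

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

KafkaHeaders = List[Tuple[str, bytes]]


def trace_id_of(headers: Optional[KafkaHeaders]) -> Optional[str]:
    """Return the trace ID from a `sentry-trace` header, or None if absent."""
    for key, value in headers or []:
        if key.lower() == "sentry-trace" and value:
            # Assumed layout: "<trace_id>-<span_id>[-<sampled>]".
            return value.decode("utf-8").split("-", 1)[0]
    return None


def summarize_batch(batch: List[KafkaHeaders], elapsed_seconds: float) -> Dict[str, float]:
    """Summarize one batch: its size, trace fan-in, and per-message time."""
    trace_ids = Counter(trace_id_of(headers) for headers in batch)
    untraced = trace_ids.pop(None, 0)  # messages without a traceparent
    return {
        "batch_size": len(batch),
        "distinct_traces": len(trace_ids),
        "untraced": untraced,
        # Execution time normalized by batch size, as floated in the thread.
        "seconds_per_message": elapsed_seconds / len(batch) if batch else 0.0,
    }
```

A batch whose summary shows more than one distinct trace has no single parent to continue from, which is the mismatch with the transaction model described above. A consumer that does not batch always yields at most one.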

i was hoping to integrate arroyo closer with DDM at some point since we already have metrics

untitaker avatar Jan 26 '24 15:01 untitaker

Getting a working Kafka integration would allow us to remove the start_transaction call in src/sentry/ingest/consumer/processors.py. We decided to keep that call in https://github.com/getsentry/sentry/issues/63590 because, without a Kafka integration, there is no auto-instrumentation to replace it.

szokeasaurusrex avatar Jan 26 '24 15:01 szokeasaurusrex