dd-trace-rb icon indicating copy to clipboard operation
dd-trace-rb copied to clipboard

Fix Karafka framework patch

Open Drowze opened this issue 2 weeks ago • 3 comments

What does this PR do?

Fixes the automatic WaterDrop integration activation (via the Karafka integration).

For context, by the time that Karafka::App#initialized! is called, Karafka.producer was already initialized. Not only that, there's really nowhere we can hook in the Karafka initialization where we can be sure that Karafka.producer wasn't yet initiaized - that's because the Karafka initialization is NOT necessarily tied to the WaterDrop initialization. For instance, the following is a possible scenario:

$producer = WaterDrop::Producer.new { ... }

Datadog.configure { |c| c.tracing.instrument :karafka }

# note that the producer was initialized before the Karafka app
# (and in this case, even before datadog was configured)
Karafka::App.setup { |c| c.producer = $producer }

So instead of trying to hook somewhere before Karafka.producer is initialized, let's simply listen to a Karafka after-initialization event and append our middleware to the producer when that event is triggered.

Motivation: We've noticed that WaterDrop distributed tracing was not working after configuring only the Karafka integration.

Change log entry

Additional Notes:

How to test the change?

Drowze avatar Dec 08 '25 18:12 Drowze

Should we be dropping the waterdrop min version in this PR?

ericfirth avatar Dec 12 '25 13:12 ericfirth

Should we be dropping the waterdrop min version in this PR?

@ericfirth Not sure about that 🤔

Short answer: I'm fine with that, but this decision has to come from the Datadog team. (and while I don't see a problem, tbh I'd prefer to not work on that support myself... I'm a bit burned from the slowness/difficulty of moving these Karafka-related pull requests forward. I'd be happy to review such work though)


Longer answer:

When I did the initial PR to add the WaterDrop integration I've added support to version >= 2.6.12, see: https://github.com/DataDog/dd-trace-rb/blob/6b654d43b79c4db7bac6abb62b19b5134e30566a/appraisal/ruby-3.4.rb#L147

The problem we had before version 2.8.8.rc1 is that there wasn't a proper API for attaching a middleware right after a producer is instantiated. So up until that point, I was simply patching the default WaterDrop monitor (as you can see here). (please note however: I was patching the default monitor, but the monitor can be swapped by the user as mentioned in the Karafka documentation (although WaterDrop isn't directly mentioned in that page, WaterDrop producers also have their monitor instance configurable))

@mensfeld (Karafka's maintainer) noted the struggle and added class-level notification events on version 2.8.8.rc1 (related WaterDrop docs), i.e.:

producer = WaterDrop::Producer.new { |c|  c.kafka = {"bootstrap.servers": "redpanda:9092"} }

# (before 2.8.8.rc1) events issued: none
# (since 2.8.8.rc1) events issued: 'producer.created', 'producer.configured'

Given that change on WaterDrop, @marcotc reviewed the PR:

👋 @Drowze, with the super fast improvements that @mensfeld did (thank you so much!), do your use cases need to instrument older versions of WaterDrop? In other words, is it reasonable to assume that upgrading WaterDrop is not a meaningful hurdle? I see that WaterDrop v2 has been around since 2021, so I'd assume the vast majority of users are on the v2 series, thus the upgrade would be non-breaking.

I'm trying to save your time, and keep only the nice new instrumentation, using the newly exposed API, if possible.

I wasn't too sure about the change being requested and argued a bit on my reply to that but, as I did not receive any reply after that (and that Karafka maintainer was personally fine with restricting the compatibility to 2.8.8.rc1), I just complied to the change and restricted the integration's minimum version to what it is today (and with that I removed the WaterDrop monitor patch in favour of listening to producer.configured waterdrop events).

So, answering whether we should lower the minimum WaterDrop version requirement: we (as in my workplace) don't have a usecase for that and I am not personally interested on it anymore, but if the DataDog team has an interest on it I don't see a problem (and perhaps they could refer to my original changes if that helps).

I hope that was clear. Apologies if my tone seemed off a bit - that wasn’t my intention. 🙇

Drowze avatar Dec 12 '25 18:12 Drowze

I hope that was clear. Apologies if my tone seemed off a bit - that wasn’t my intention. 🙇

Thank you for the detailed reply and so much information. I apologize as I think my question was a bit unclear and maybe I didn't fully understand the PR when I posed it.

This particular change, I might have have misunderstood it. The description made it seem like Karafka instrumentation doesn't work (especially with Waterdrop), but I made a test app and saw it work (or at least for DSM which was the part I was concerned with). In that test, the waterdrop version was 2.8.8. So I interpreted this PR as making Waterdrop work with 2.6+ since my understanding was that Waterdrop instrumentation worked. However, reading your comment and then re-reading the description and code more carefully, I now understand why my test worked and what this PR does and doesn't do. I think the previous decision to restrict to recent versions of Waterdrop is fine

ericfirth avatar Dec 12 '25 19:12 ericfirth