dd-trace-rb icon indicating copy to clipboard operation
dd-trace-rb copied to clipboard

[fix] rack middleware close trace via body proxy (Rack::CommonLogger compatibility)

Open skcc321 opened this issue 3 years ago • 9 comments

What is the issue?

Hanami framework is Rack-based framework. It uses Rack::CommonLogger middleware for printing out request/response logs. The issue is, current span/trace is not associated with these logs because Datadog rack middleware closes trace too early. If we take a look at Rack::CommonLogger we can observe that Rack::CommonLogger uses Rack::BodyProxy to guarantee that log is printed only when everything else is done (datadog trace as well) as a result we don't have an active correlation in logs formatter. So I implemented the same approach for closing current span/trace in Datadog rack middleware.

skcc321 avatar Nov 01 '21 11:11 skcc321

Is there any way to push the Datadog middleware to the top of the stack? Or does BodyProxy truly have to run first/last? (Before/after ddtrace middleware?) If there's a way to do this, I think that would be the simplest solution.

You mentioned this is a Hanami app... one possibility: if necessary, we could introduce a Hanami integration that is aware of its middleware stack and can manage this. (Similar to what we do for Rails already.)

Let me know if any of that would work, or if I'm missing some detail.

delner avatar Nov 02 '21 04:11 delner

@delner hanami is just a rack based framework. So the issue is related to Rack::CommonLogger which is something which exists many years. Rack::CommonLogger is a part of rack library so technically datadog "rack" instrumentation is not complete without handling that CommonLogger properly.

skcc321 avatar Nov 02 '21 08:11 skcc321

@marcotc so as I understand the solution for rack middleware is: Insert datadog rack middleware on top of all middlewares, so trace/span is started for sure at the very beginning of a request processing and closed at the very end. So according to behaviour of Rack::BodyProxy I think the implementation is OK (if we are sure that datadog rack middleware was inserted at the very beginning)

If the Body responds to +close+, it will be called after iteration. If the original Body is replaced by a new Body, the new Body must close the original Body after iteration, if it responds to +close+. If the Body responds to both +to_ary+ and +close+, its implementation of +to_ary+ must call +close+ after iteration.

Doesn't that guarantee that body will be closed always?

skcc321 avatar Nov 02 '21 08:11 skcc321

could you guys take a look at the problem and solution at hand here and see if you have any options or suggestions for solving the Rack::CommonLogger problem at hand?

I haven't dug into this that often, so take my opinions/suggestions with a big grain of salt, but looking at Rack::CommonLogger it seems to gather a number of infos from the app when it executes as part of the rack stack, and then uses the BodyProxy as a way of printing that information some time later in the future:

https://github.com/rack/rack/blob/e0993d47096d36535dbe725091963a7e0f5aab52/lib/rack/common_logger.rb#L36-L42

But that gets me thinking; isn't what we're trying to do here similar to what the rack common logger is doing? E.g. could we gather the current span information inside CommonLogger#call, and store it for later, so that the logger could access it after-the-fact?

ivoanjo avatar Nov 02 '21 09:11 ivoanjo

so technically datadog "rack" instrumentation is not complete without handling that CommonLogger properly.

@skcc321 Right, so why couldn't we push the datadog middleware to the the last middleware?

delner avatar Nov 02 '21 21:11 delner

so technically datadog "rack" instrumentation is not complete without handling that CommonLogger properly.

@skcc321 Right, so why couldn't we push the datadog middleware to the the last middleware?

@delner the reason is that Rack::BodyProxy will write logs as a "callback" so it always goes last regardless Datadog middleware on top of other middlewares. As a result, you have no active span/trace in Rack::CommonLogger#call

skcc321 avatar Nov 03 '21 10:11 skcc321

Codecov Report

Merging #1746 (513c157) into master (8b97105) will increase coverage by 0.00%. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1746   +/-   ##
=======================================
  Coverage   98.17%   98.17%           
=======================================
  Files         934      934           
  Lines       45048    45056    +8     
=======================================
+ Hits        44227    44235    +8     
  Misses        821      821           
Impacted Files Coverage Δ
lib/ddtrace/contrib/delayed_job/plugin.rb 100.00% <100.00%> (ø)
lib/ddtrace/contrib/rack/middlewares.rb 85.71% <100.00%> (+0.89%) :arrow_up:
lib/ddtrace/contrib/rack/patcher.rb 93.75% <100.00%> (+0.13%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8b97105...513c157. Read the comment docs.

codecov-commenter avatar Nov 19 '21 12:11 codecov-commenter

@delner @ivoanjo any updates on this one?

skcc321 avatar Nov 19 '21 12:11 skcc321

Thanks for the ping, I'll raise this with the team!

ivoanjo avatar Nov 19 '21 12:11 ivoanjo