opentelemetry-erlang icon indicating copy to clipboard operation
opentelemetry-erlang copied to clipboard

Add otel_monitor

Open derekkraan opened this issue 1 year ago • 9 comments

Following up on this PR: https://github.com/open-telemetry/opentelemetry-erlang-contrib/pull/109

This PR adds a new module, otel_monitor. We associate the span with a monitor ref in an ets table, and end the span when we detect that the process has died.

Looking forward to feedback for this one.

(note: I haven't looked at tests yet)

derekkraan avatar Sep 19 '22 14:09 derekkraan

Codecov Report

Base: 73.57% // Head: 72.83% // Decreases project coverage by -0.74% :warning:

Coverage data is based on head (9cbc113) compared to base (d791c5b). Patch coverage: 20.83% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #461      +/-   ##
==========================================
- Coverage   73.57%   72.83%   -0.75%     
==========================================
  Files          53       54       +1     
  Lines        1722     1745      +23     
==========================================
+ Hits         1267     1271       +4     
- Misses        455      474      +19     
Flag Coverage Δ
api 68.77% <ø> (ø)
elixir 18.77% <ø> (ø)
erlang 74.29% <20.83%> (-0.80%) :arrow_down:
exporter 72.87% <ø> (ø)
sdk 77.38% <20.83%> (-1.63%) :arrow_down:
zipkin 51.47% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
apps/opentelemetry/src/otel_monitor.erl 13.63% <13.63%> (ø)
apps/opentelemetry/src/opentelemetry_sup.erl 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Sep 19 '22 14:09 codecov[bot]

Just had a thought, maybe this should be part of the sweeper?

tsloughter avatar Sep 19 '22 22:09 tsloughter

My first instinct would be to keep it self-contained and not include it in the sweeper. It's also a different use-case to the sweeper. The usage patterns of the sweeper are also quite different; it does a lot of work sometimes, while the monitor is doing a small amount of work all of the time, and it will also end up in the hot path of people's code, so it should be fast.

Perhaps a cast is also better from the perspective of speed...

derekkraan avatar Sep 20 '22 06:09 derekkraan

@derekkraan What do you think of the alternative of monitoring every span in the process that is monitored?

I also wonder about the api being an option passed to with/start_span. It would still monitor all spans in the process but make it more obvious when a user is looking at the API and with_span/start_span that they can do it.

tsloughter avatar Nov 08 '22 18:11 tsloughter

Re: monitoring every span, sounds good, and it appears this is what everyone wants, but I think then the exception event should only be added to the outermost one? Or the one that has been explicitly monitored? What do you think?

I do think there should also be an option to with/start_span. I was initially treating that as something that could be added later but if there is already agreement on what that should look like (not that hard perhaps, just monitor: true?), then I can certainly look at adding that in this PR.

derekkraan avatar Nov 10 '22 14:11 derekkraan

the exception event should only be added to the outermost one? Or the one that has been explicitly monitored? What do you think?

Yes outermost seems fine.

hkrutzer avatar Jan 12 '23 15:01 hkrutzer

I don't think outermost is a good idea. It would be a bit of a pain to implement. And really I would think the inner most would make the most sense as its where the failure is more likely to have happened?

I think it is better to just mark any active span in a process with the information.

tsloughter avatar Jan 14 '23 00:01 tsloughter

If the innermost is easier I think that's fine too, any trace viewer should handle that fine.

hkrutzer avatar Jan 14 '23 19:01 hkrutzer

My latest https://github.com/open-telemetry/opentelemetry-erlang/pull/602

tsloughter avatar Aug 10 '23 11:08 tsloughter