opentelemetry-go-contrib icon indicating copy to clipboard operation
opentelemetry-go-contrib copied to clipboard

otelhttptrace: handle missing getconn hook without panic

Open tonistiigi opened this issue 1 year ago • 8 comments

We have many reports that end() gets called without the span being defined in start() and causes a panic.

Ref https://github.com/moby/buildkit/issues/4377 Ref https://github.com/docker/buildx/issues/2232

I have not not fully debugged in what condition this happens in stdlib (I assume some keepalive pool case) but I don't see any other possible explanation for these panic cases.

Note that there are other httptrace hooks that also use .root without validation. I don't atm. have any proof that these could be called without GetConn() being called first as well so didn't add extra validation to these.

tonistiigi avatar Feb 27 '24 06:02 tonistiigi

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 64.4%. Comparing base (669ca60) to head (ded1cd2). Report is 1143 commits behind head on main.

Files with missing lines Patch % Lines
...on/net/http/httptrace/otelhttptrace/clienttrace.go 0.0% 1 Missing and 1 partial :warning:
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #5187     +/-   ##
=======================================
- Coverage   64.5%   64.4%   -0.1%     
=======================================
  Files        200     200             
  Lines      12558   12560      +2     
=======================================
- Hits        8103    8101      -2     
- Misses      4219    4221      +2     
- Partials     236     238      +2     
Files with missing lines Coverage Δ
...on/net/http/httptrace/otelhttptrace/clienttrace.go 79.0% <0.0%> (-0.8%) :arrow_down:

... and 1 file with indirect coverage changes

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Feb 27 '24 06:02 codecov[bot]

This change only handles missing span in end() if useSpans is false. It seems like that could also happen with the use of sub spans as well.

Also, I know this is hard to reproduce, and this package doesn't really have any tests at the moment, but would it be possible to write a test?

dmathieu avatar Feb 27 '24 08:02 dmathieu

This change only handles missing span in end() if useSpans is false. It seems like that could also happen with the use of sub spans as well.

What place/variable do you have in mind?

Also, I know this is hard to reproduce, and this package doesn't really have any tests at the moment, but would it be possible to write a test?

These calls are made from stdlib and I don't know the condition when the specific call order appears.

tonistiigi avatar Feb 27 '24 15:02 tonistiigi

These calls are made from stdlib and I don't know the condition when the specific call order appears.

I think the main usage of httptrace.ClientTrace is here: https://github.com/golang/go/blob/master/src/net/http/transport.go

pellared avatar Feb 27 '24 15:02 pellared

We started seeing this as well after introducing the option otelhttptrace.WithoutSubSpans()

The original implementation was taking locks at the beginning on the function:

func (ct *clientTracer) end(hook string, err error, attrs ...kv.KeyValue) { ct.mtx.Lock()

defer ct.mtx.Unlock()

However since the introduction of if !ct.useSpans { the locks are not taken immediately.

It made me wonder, could this have cause some race condition?

alessandrozucca-p avatar Jun 21 '24 15:06 alessandrozucca-p

This will need a changelog entry.

dmathieu avatar Jun 24 '24 07:06 dmathieu

Added changelog

tonistiigi avatar Jun 25 '24 23:06 tonistiigi

Hi, just ran into https://github.com/moby/buildkit/issues/4377 . Any update on this PR? I understand only unit test is missing?

krajorama avatar Jul 30 '24 09:07 krajorama