aws-xray-sdk-go icon indicating copy to clipboard operation
aws-xray-sdk-go copied to clipboard

Data races on http.DefaultTransport with rc15

Open dcu opened this issue 5 years ago • 15 comments

WARNING: DATA RACE
Read at 0x0000022e7450 by goroutine 12:
  net/http.(*Client).send()
      /usr/local/Cellar/go/1.14.1/libexec/src/net/http/client.go:176 +0x39d
  net/http.(*Client).do()
      /usr/local/Cellar/go/1.14.1/libexec/src/net/http/client.go:699 +0x2cc
  net/http.(*Client).Do()
      /usr/local/Cellar/go/1.14.1/libexec/src/net/http/client.go:567 +0x7f
  github.com/aws/aws-sdk-go/aws/corehandlers.sendFollowRedirects()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/aws/corehandlers/handlers.go:120 +0x2f
  github.com/aws/aws-sdk-go/aws/corehandlers.glob..func3()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/aws/corehandlers/handlers.go:112 +0xfd
  github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/aws/request/handlers.go:267 +0xd1
  github.com/aws/aws-sdk-go/aws/request.(*Request).sendRequest()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/aws/request/request.go:578 +0xf0
  github.com/aws/aws-sdk-go/aws/request.(*Request).Send()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/aws/request/request.go:531 +0x1bf
  github.com/aws/aws-sdk-go/service/xray.(*XRay).GetSamplingRules()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/service/xray/api.go:869 +0x5e
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*proxy).GetSamplingRules()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/strategy/sampling/proxy.go:86 +0x95
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).refreshManifest()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/strategy/sampling/centralized.go:264 +0x15c
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).startRulePoller.func1()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/strategy/sampling/centralized.go:207 +0x50

Previous write at 0x0000022e7450 by goroutine 19:
  github.com/jarcoal/httpmock.Deactivate()
      /Users/dcuadrado/go/pkg/mod/github.com/jarcoal/[email protected]/transport.go:716 +0xaf
  github.com/jarcoal/httpmock.DeactivateAndReset()
      /Users/dcuadrado/go/pkg/mod/github.com/jarcoal/[email protected]/transport.go:740 +0x2f
...

Goroutine 12 (running) created at:
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).startRulePoller()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/strategy/sampling/centralized.go:206 +0x4c
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).start()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/strategy/sampling/centralized.go:196 +0x104
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).ShouldTrace()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/strategy/sampling/centralized.go:134 +0xcd1
  github.com/aws/aws-xray-sdk-go/xray.BeginSegmentWithSampling()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/xray/segment.go:80 +0x5fe
  github.com/aws/aws-xray-sdk-go/xray.BeginSegment()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/[email protected]/xray/segment.go:60 +0x9c
...

dcu avatar Apr 10 '20 19:04 dcu

Hi @dcu ,

would you mind describing your application setup and provide repro steps ?

bhautikpip avatar Apr 10 '20 19:04 bhautikpip

Looks like data race is not happening within X-Ray SDK but when this goroutine (https://github.com/aws/aws-xray-sdk-go/blob/master/strategy/sampling/centralized.go#L206) tries to make a http request, this codebase (https://github.com/jarcoal/httpmock/blob/v1/transport.go#L716) from httpmock library updates the value of http.DefaultTransport.

bhautikpip avatar Apr 10 '20 20:04 bhautikpip

is there a way to disable xray so I can do that in my tests?

dcu avatar Apr 10 '20 21:04 dcu

Hi @dcu ,

Sorry for the inconvenience. Currently we don't have support for disabling XRay in XRay Go SDK. We have a pending item in our backlog to support this feature. Meanwhile I am looking more into this issue to see if we can provide any work around to avoid this issue.

bhautikpip avatar Apr 10 '20 21:04 bhautikpip

maybe allowing to set my own transport or http client

dcu avatar Apr 14 '20 16:04 dcu

Hi @dcu ,

That's a good idea I will look more into it to see if we can do that. Also, I'm currently working on disabling XRay Go SDK using environment variable (https://github.com/aws/aws-xray-sdk-go/pull/219). Let me know if this will be able to solve your issue. Would you mind trying this change out and provide your feedback ?

bhautikpip avatar Apr 14 '20 16:04 bhautikpip

I just did, setting AWS_XRAY_SDK_DISABLED to TRUE works and doesn't produce a race condition in my tests.

Still I think you should also allow to set the http client since the default one is not great, pool size is too limited and timeouts are not configured by default

dcu avatar Apr 14 '20 21:04 dcu

Hi @dcu ,

That's good to hear. Would you mind providing more details about http client approach you're talking about ? I want to understand more on that so that we can provide best possible solution to avoid this issue.

bhautikpip avatar Apr 14 '20 21:04 bhautikpip

from what I see, XRay is using the default HTTP client to send the data to its API. It'd be better if you expose a way to change that client so I can use my own configured client.

In the offical AWS SDK you can configure this when creating a session.Session by passing a aws.Config but it doesn't seem possible to do that with the xray SDK

dcu avatar Apr 14 '20 22:04 dcu

In your patch, if I disable xray would I still catch panics like this: failed to begin subsegment named 'kinesis': segment cannot be found. ?

dcu avatar Apr 15 '20 20:04 dcu

No, Ideally you shouldn't receive those panics if you have disabled xray. Are you getting those panics ?

bhautikpip avatar Apr 15 '20 20:04 bhautikpip

no I didn't test that. My concern is not catching those in tests but later after deployment

dcu avatar Apr 15 '20 20:04 dcu

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar May 15 '20 21:05 stale[bot]

any news on this issue?

dcu avatar May 17 '20 11:05 dcu

Hi @dcu ,

We still need to prioritize those efforts as you have mentioned (exposing a way to change default client while calling X-Ray API) in comments. I will put this item in our backlog. Meanwhile we have this PR ready to merge in (https://github.com/aws/aws-xray-sdk-go/pull/219). Let me know if you have any concerns about this PR and we can work accordingly.

bhautikpip avatar May 18 '20 17:05 bhautikpip