opentelemetry-operations-python

CloudTraceSpanExporter+SimpleSpanProcessor+RequestsInstrumentor Infinite Loop

Open davidn opened this issue 2 years ago • 3 comments

If you enable RequestsInstrumentor while using CloudTraceSpanExporter with SimpleSpanProcessor, you get an infinite loop on the first span exported.

When running locally, I see the following lines repeated with log level debug:

I0206 10:03:10.286911 140694698043136 connectionpool.py:460] https://oauth2.googleapis.com:443 "POST /token HTTP/1.1" 200 None
I0206 10:03:10.290570 140694689650432 requests.py:192] Making request: POST https://oauth2.googleapis.com/token

When running in Cloud Run, I see the following log entry repeated:

http://metadata.google.internal:80 "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true HTTP/1.1" 200 615

Because of this, I assume the issue is that a requests call is made during Cloud Trace export, which results in a span being recorded by the instrumentor and thus a new call to Cloud Trace export before the first one has completed auth setup, resulting in a new auth setup, and so on.
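The suspected re-entrancy can be illustrated with a toy, stdlib-only model. This is a sketch of the hypothesis above, not the actual OpenTelemetry or Cloud Trace code: `ToyPipeline`, its methods, and the `max_depth` guard are all hypothetical names invented for the illustration, and the guard exists only so the example terminates.

```python
class ToyPipeline:
    """Toy model: an instrumented HTTP call, a synchronous span processor,
    and an exporter that itself makes an HTTP call to fetch a token.
    All names here are hypothetical; none are OpenTelemetry APIs."""

    def __init__(self, max_depth=5):
        self.max_depth = max_depth  # artificial guard so the demo stops
        self.depth = 0
        self.log = []

    def instrumented_http_call(self, request):
        # The instrumentor wraps every HTTP call in a span.
        self.log.append(f"request: {request}")
        self.on_span_end(f"span for {request}")

    def on_span_end(self, span):
        # A synchronous (foreground) processor exports as soon as a span ends.
        self.export(span)

    def export(self, span):
        if self.depth >= self.max_depth:
            self.log.append("stopped: depth guard hit")
            return
        self.depth += 1
        try:
            # Auth is not set up yet, so the exporter fetches a token over
            # HTTP. That call is instrumented too, which ends another span
            # and re-enters export() before this one has finished.
            self.instrumented_http_call("POST /token")
        finally:
            self.depth -= 1


pipeline = ToyPipeline(max_depth=5)
pipeline.instrumented_http_call("GET /app")
for line in pipeline.log:
    print(line)
```

Without the depth guard, the single application request would recurse until the interpreter's recursion limit is hit, which matches the repeated token requests in the logs above.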

This blog post appears to document the same issue: https://minherz.medium.com/today-i-learned-why-using-opentelemetry-requestsinstrumentor-can-freeze-your-application-ae09410b016d

Because Cloud Run requires SimpleSpanProcessor, this essentially means RequestsInstrumentor cannot be used in that environment.

davidn • Feb 06 '23 20:02

> Because of this, I assume the issue is that a requests call is made during Cloud Trace export, which results in a span being recorded by the instrumentor and thus a new call to Cloud Trace export before the first one has completed auth setup, resulting in a new auth setup, and so on.

:+1: That sounds correct, and I've seen this issue as well. There is a suppress_instrumentation context key that is supposed to fix this looping problem, but IIRC it gets thrown away at some point.
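As a rough illustration of what such a suppression flag is meant to do, here is a stdlib-only sketch that uses `contextvars` as a stand-in. The `SUPPRESS` variable and both functions are hypothetical; this is not the OpenTelemetry context API, only the general idea of marking "we are inside an export, do not create spans".

```python
import contextvars

# Hypothetical stand-in for the suppress_instrumentation context key.
SUPPRESS = contextvars.ContextVar("suppress_instrumentation", default=False)

requests_made = []
spans_exported = []


def instrumented_http_call(request):
    """Toy instrumented HTTP call: records a span unless suppressed."""
    requests_made.append(request)
    if SUPPRESS.get():
        return  # inside an export: skip span creation, so no re-entry
    export(f"span for {request}")


def export(span):
    """Toy exporter: sets the flag, makes its own HTTP call, then exports."""
    token = SUPPRESS.set(True)
    try:
        instrumented_http_call("POST /token")
        spans_exported.append(span)
    finally:
        SUPPRESS.reset(token)  # restore the previous value


instrumented_http_call("GET /app")
print(requests_made)
print(spans_exported)
```

The sketch only shows the intended behavior when the flag stays visible for the whole export. It does not model how or where the real key gets lost.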

> Because Cloud Run requires SimpleSpanProcessor, this essentially means RequestsInstrumentor cannot be used in that environment.

Where do you see that Cloud Run requires SimpleSpanProcessor? I would actually recommend never using SimpleSpanProcessor and always using BatchSpanProcessor. Just make sure you call TracerProvider.shutdown() in a SIGTERM handler, as described in the Cloud Run documentation, to flush any buffered telemetry.
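A minimal sketch of that SIGTERM handler, kept stdlib-only so it runs anywhere. `make_sigterm_handler` and `StandInProvider` are hypothetical names for this example, and the commented-out wiring assumes the opentelemetry-sdk and opentelemetry-exporter-gcp-trace packages are installed. Exiting after the flush is also an assumption here, since installing a handler replaces the default terminate-on-SIGTERM behavior.

```python
import signal
import sys


class StandInProvider:
    """Hypothetical stand-in for a TracerProvider; only shutdown() matters."""

    def __init__(self):
        self.shutdown_calls = 0

    def shutdown(self):
        self.shutdown_calls += 1


def make_sigterm_handler(provider):
    """Build a signal handler that flushes telemetry, then exits."""

    def handle_sigterm(signum, frame):
        provider.shutdown()  # flush any spans still buffered by the processor
        sys.exit(0)          # assumption: exit ourselves once flushed

    return handle_sigterm


if __name__ == "__main__":
    # Expected real wiring (not executed here):
    #   from opentelemetry.sdk.trace import TracerProvider
    #   from opentelemetry.sdk.trace.export import BatchSpanProcessor
    #   from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
    #   provider = TracerProvider()
    #   provider.add_span_processor(BatchSpanProcessor(CloudTraceSpanExporter()))
    provider = StandInProvider()
    signal.signal(signal.SIGTERM, make_sigterm_handler(provider))
```

With BatchSpanProcessor, export happens off the request path, so the shutdown call is what ensures buffered spans are sent before the instance stops.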

aabmass • Feb 13 '23 20:02

> Where do you see that Cloud Run requires SimpleSpanProcessor?

https://cloud.google.com/trace/docs/setup/python-ot says: "To send spans with a foreground process, use the SimpleSpanProcessor processor. If you are using Cloud Run, then you must use this processor"

I haven't actually tested BatchSpanProcessor on Cloud Run; I just read that note in the doc.

davidn • Feb 13 '23 20:02

Thank you for pointing this out! Let me see if we can get the docs sorted out.

aabmass • Mar 06 '23 20:03