opentelemetry-operations-python
CloudTraceSpanExporter+SimpleSpanProcessor+RequestsInstrumentor Infinite Loop
If you enable RequestsInstrumentor while using CloudTraceSpanExporter with SimpleSpanProcessor, you get an infinite loop on the first span exported.
When running locally, I see the following lines repeated with log level debug:
I0206 10:03:10.286911 140694698043136 connectionpool.py:460] https://oauth2.googleapis.com:443 "POST /token HTTP/1.1" 200 None
I0206 10:03:10.290570 140694689650432 requests.py:192] Making request: POST https://oauth2.googleapis.com/token
When running in Cloud Run, I see the following log entry repeated:
http://metadata.google.internal:80 "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true HTTP/1.1" 200 615
Because of this, I assume the issue is that a requests call is made during Cloud Trace export. The instrumentor records that call as a span, which triggers a new Cloud Trace export before the first one has completed auth setup, which in turn triggers new auth setup, and so on.
This blog post appears to document the same issue: https://minherz.medium.com/today-i-learned-why-using-opentelemetry-requestsinstrumentor-can-freeze-your-application-ae09410b016d
Because Cloud Run requires SimpleSpanProcessor, this essentially means RequestsInstrumentor cannot be used in that environment.
> Because of this, I assume the issue is that a requests call is made during Cloud Trace export. The instrumentor records that call as a span, which triggers a new Cloud Trace export before the first one has completed auth setup, which in turn triggers new auth setup, and so on.

:+1: that sounds correct, and I've seen this issue as well. There is a suppress_instrumentation context key that is supposed to fix this looping problem, but IIRC it gets thrown away at some point.

> Because Cloud Run requires SimpleSpanProcessor, this essentially means RequestsInstrumentor cannot be used in that environment.

Where do you see that Cloud Run requires SimpleSpanProcessor? I would actually recommend never using SimpleSpanProcessor and always using BatchSpanProcessor. Just make sure you call TracerProvider.shutdown() in a SIGTERM handler as described in Cloud Run documentation to flush any buffered telemetry.
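A minimal sketch of that setup, assuming the `opentelemetry-sdk` and `opentelemetry-exporter-gcp-trace` packages are installed. The helper names `build_tracer_provider` and `flush_on_sigterm` are made up for illustration, and I have not verified this on Cloud Run:

```python
import signal
import sys
from typing import Any, Callable


def build_tracer_provider() -> Any:
    """Create a TracerProvider that exports to Cloud Trace in batches."""
    # Imported here so the signal helper below works without these packages.
    from opentelemetry import trace
    from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(CloudTraceSpanExporter()))
    trace.set_tracer_provider(provider)
    return provider


def flush_on_sigterm(provider: Any) -> Callable:
    """Install a SIGTERM handler that flushes buffered spans, then exits."""

    def handler(signum, frame):
        provider.shutdown()  # flushes and stops the span processors
        sys.exit(0)

    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGTERM, handler)
    return handler
```

At startup this would be used as `flush_on_sigterm(build_tracer_provider())`. Because BatchSpanProcessor exports from a background thread rather than inline when each span ends, the export no longer runs inside the request that produced the span.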
> Where do you see that Cloud Run requires SimpleSpanProcessor?

https://cloud.google.com/trace/docs/setup/python-ot says: "To send spans with a foreground process, use the SimpleSpanProcessor processor. If you are using Cloud Run, then you must use this processor"
I haven't actually tested using BatchSpanProcessor on Cloud Run; I just read that note in the doc.
Thank you for pointing this out! Let me see if we can get the docs sorted out.