opentelemetry-python icon indicating copy to clipboard operation
opentelemetry-python copied to clipboard

Duplicate Span, sometimes

Open Liubey opened this issue 2 years ago • 14 comments

Steps to reproduce When I use opentelemetry in python3, sometimes I got duplicate span . these code used to create tracer:

        from opentelemetry import trace, version
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
        from opentelemetry.sdk.resources import Resource
        from opentelemetry.sdk.trace import TracerProvider
        from opentelemetry.sdk.trace.export import BatchSpanProcessor, SimpleSpanProcessor, ConsoleSpanExporter
        from opentelemetry.propagate import set_global_textmap
        from opentelemetry.propagators.b3 import B3Format
        set_global_textmap(B3Format())
        os.environ.setdefault(u'OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT', u'10240')
        resource = Resource(attributes={
            PROCESS_IP_TAG_KEY: local_ip(),
            PROCESS_HOSTNAME_TAG_KEY: host_name(),
            PROCESS_SERVICE_NAME_TAG_KEY: service_name,
            PROCESS_SDK_VERSION_TAG_KEY: u'Opentelemetry-Python-' + version.__version__
        })
        # https://opentelemetry.io/docs/instrumentation/python/exporters/
        trace.set_tracer_provider(TracerProvider(resource=resource))
        otlp_exporter = OTLPSpanExporter(endpoint=u'http://localhost:6831', insecure=True)
        span_processor = BatchSpanProcessor(otlp_exporter)
        trace.get_tracer_provider().add_span_processor(span_processor)
        if is_tracing_print_log():
            trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))

        tracer_implements = trace.get_tracer(__name__)

this code used to create span:

        from opentelemetry import trace, baggage
        exist_active_span = trace.get_current_span()
        span = self.tracer_implements.start_span(name, context=parent_span_context)
        span.set_attribute(TAG_PRODUCT_FLAG, product_flag)
        scope = trace.use_span(span, end_on_exit=True)
        setattr(scope, u'__span__', span)
        scope.__enter__()
        return scope

this code used to end span:

        current_scope.__exit__(None, None, None)

Something that needs to be said IT IS NOT 100%, when I restart my server It may be disappeared. But Something when I restart my server again, It may be appear. I swear It's duplicate by report step. cause console print twice already. In Jaeger It looks like this: image

What is the expected behavior? What happened?

What is the actual behavior?

Additional context

Liubey avatar Apr 21 '23 06:04 Liubey

Is there a way for you to check if the span IDs are duplicated?

ocelotl avatar Jun 06 '23 14:06 ocelotl

I can confirm that span and traces id are duplicated. I'm testing this connector sending traces from a Django application that use last otel instrumentation library (0.39) and I receive this:

2023-07-18T12:18:01.543Z	debug	exceptionsconnector/connector_metrics.go:99checking trace/span	{"kind": "exporter", "name": "exceptions", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics", "trace_id": "2bc89f693a428f16e44f9a1e63e3bc50", "span_id": "5a507db467f88147"}
2023-07-18T12:18:01.545Z	debug	exceptionsconnector/connector_metrics.go:99checking trace/span	{"kind": "exporter", "name": "exceptions", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics", "trace_id": "2bc89f693a428f16e44f9a1e63e3bc50", "span_id": "5a507db467f88147"}

With previous versions (last time i tried it was 0.26) I didn't have this issue.

marctc avatar Jul 18 '23 12:07 marctc

Doing some testing I found that opentelemetry v1.12.0rc2 & v0.32b0 was the first release that includes this bug.

marctc avatar Jul 18 '23 12:07 marctc

@Liubey @marctc There has been a few releases since this issue was created. Can you confirm whether you are still getting duplicate spans?

lzchen avatar Sep 25 '23 17:09 lzchen

Hey. I'm also getting a similar issue. I'm seeing duplicate spans.

devdut1999 avatar Oct 26 '23 13:10 devdut1999

This seems really hard to reproduce and the culprit could be in several different components. If someone could provide a full repro with a docker compose yaml setup, and pinned python dependencies, we could definitely investigate.

aabmass avatar Oct 26 '23 16:10 aabmass

@Liubey @marctc There has been a few releases since this issue was created. Can you confirm whether you are still getting duplicate spans?

I can confirm the problem still persists :shrug:

marctc avatar Oct 27 '23 16:10 marctc

This seems really hard to reproduce and the culprit could be in several different components. If someone could provide a full repro with a docker compose yaml setup, and pinned python dependencies, we could definitely investigate.

this is my docker-compose:

version: "2"
services:
  api:
    depends_on:
      - otel-collector
    build: ./service/api/.
    command: opentelemetry-instrument uwsgi --http :8000 --module faulty.wsgi --master
    ports:
      - "8000:8000"
    environment:
      - DJANGO_SETTINGS_MODULE=api.settings
      - OTEL_RESOURCE_ATTRIBUTES=service.name=api
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
      - PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
      - OTEL_EXPORTER_OTLP_INSECURE=true

  otel-collector:
    build: ./otel-collector/.
    ports:
      - "6831:6831"
      - "14268:14268"
      - "4317:4317"
      - "4318:4318"
    volumes:
      - ./config/otel-collector.yaml:/config/otel-collector.yaml
    command: /bin/otelcol --config=/config/otel-collector.yaml

wsgi.py content:

import os

from django.core.wsgi import get_wsgi_application
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from uwsgidecorators import postfork

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'faulty.settings')

application = get_wsgi_application()


@postfork
def init_tracing():
    trace.set_tracer_provider(TracerProvider())
    trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(
        OTLPSpanExporter(endpoint=os.environ.get('OTEL_EXPORTER_OTLP_ENDPOINT'))
    ))
    trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(
        ConsoleSpanExporter()
    ))

requirements.txt content

opentelemetry-exporter-otlp==1.20.0
opentelemetry-instrumentation-django==v0.41b0
opentelemetry-distro==v0.41b0
Django==4.1
uWSGI==2.0.21

marctc avatar Oct 27 '23 16:10 marctc

@marctc this example is missing Dockerfiles and django app? A github gist or repo I can clone and just run docker compose up would be easiest.

aabmass avatar Oct 31 '23 17:10 aabmass

@marctc this example is missing Dockerfiles and django app? A github gist or repo I can clone and just run docker compose up would be easiest.

there you go https://github.com/marctc/django-otel-demo

marctc avatar Nov 01 '23 09:11 marctc

@marctc you are adding two OTLPSpanExporters to the TracerProvider which is causing duplicates. I added this to your wsgi.py

diff --git a/service/faulty/faulty/wsgi.py b/service/faulty/faulty/wsgi.py
index 82eacb9..997d833 100644
--- a/service/faulty/faulty/wsgi.py
+++ b/service/faulty/faulty/wsgi.py
@@ -31,3 +31,7 @@ def init_tracing():
     trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(
         ConsoleSpanExporter()
     ))
+
+    print(
+        [sp.span_exporter for sp in trace.get_tracer_provider()._active_span_processor._span_processors]
+    )

and it prints out

[<opentelemetry.exporter.otlp.proto.grpc.trace_exporter.OTLPSpanExporter object at 0x7fba8dc19150>, <opentelemetry.exporter.otlp.proto.grpc.trace_exporter.OTLPSpanExporter object at 0x7fba74263e50>, <opentelemetry.sdk.trace.export.ConsoleSpanExporter object at 0x7fba744733d0>]

If you look at the Python startup logs, you'll see the warning Overriding of current TracerProvider is not allowed. What's happening is opentelemetry-instrument sets up the global tracer provider for you already with an OTLP exporter (the default). Then in wsgi.py you're adding trace.set_tracer_provider(TracerProvider()) has no effect (because you can't override the global once set) and you end up just adding more exporters to the existing TracerProvider.

You should only use one of these two options for setting up your SDK:

  1. Use opentelemetry-instrument with appropriate envvars or command line flags
  2. Create a TracerProvider in your code and set it as global with opentelemetry.trace.set_tracer_provider()

aabmass avatar Nov 01 '23 18:11 aabmass

@aabmass But I NEVER use opentelemetry-instrument. And It's difficulty to reproduce. But I will notice the log 'Overriding of current TracerProvider is not allowed'

Liubey avatar Nov 09 '23 02:11 Liubey

The repro you provided is using it https://github.com/marctc/django-otel-demo/blob/9a3e1d614fda963578ff752db524283562520252/docker-compose.yaml#L7

aabmass avatar Nov 09 '23 14:11 aabmass

We are Observing the same Issue, sometimes it is 20x, 8x , 2x duplication. Its pretty random and ruining the traces. Collector Version: 0.100.0

bazingarj avatar Aug 07 '24 01:08 bazingarj