opentelemetry-python-contrib

Request URLs with unicode chars crash WSGI instrumentation

Open kujenga opened this issue 9 months ago • 0 comments

Describe your environment

OS: macOS
Python version: 3.11.10
Package version: 0.50b0

What happened?

I have a Django application that uses DjangoInstrumentor().instrument() to add instrumentation.

Requesting a URL such as http://localhost:8000/%F0%9F%98%84 (a percent-encoded Unicode character) in a Django app with this instrumentation installed fails with a 500 error. No such error occurs when the instrumentation is not enabled.
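For context, the path in that URL is the percent-encoded UTF-8 form of a single emoji. A minimal sketch using only the standard library (the variable names are illustrative) shows that the decoded character lies outside the Latin-1 range, which is relevant because the failing frame in the traceback encodes with latin1:

```python
from urllib.parse import unquote

encoded_path = "/%F0%9F%98%84"
decoded_path = unquote(encoded_path)  # unquote() decodes as UTF-8 by default

# Code point of the character that follows the leading slash.
code_point = ord(decoded_path[1])
print(f"U+{code_point:04X}")

# Latin-1 only covers code points 0..255, so this character cannot be encoded.
try:
    decoded_path.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)
```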

Steps to Reproduce

Setup with:

uv init
uv add django opentelemetry-sdk opentelemetry-instrumentation-django
uv run django-admin startproject crash_repro
cd crash_repro

Replace manage.py with:

#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys

from opentelemetry import trace
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider as SDKTracerProvider
from opentelemetry.sdk.trace.export import (
    BatchSpanProcessor, ConsoleSpanExporter,
)


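def initialize_otel_provider():
    """Install an SDK tracer provider that prints finished spans to the console."""
    resource = Resource(attributes={SERVICE_NAME: "crash_repro"})
    tracer_provider = SDKTracerProvider(resource=resource)

    # Export spans to stdout in batches so they appear in the runserver output.
    exporter = ConsoleSpanExporter()
    span_processor = BatchSpanProcessor(exporter)
    tracer_provider.add_span_processor(span_processor)

    # Register the provider globally so the instrumentation uses it.
    trace.set_tracer_provider(tracer_provider)
    return tracer_provider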

def main():
    """Run administrative tasks."""
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "crash_repro.settings")

    initialize_otel_provider()

    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    execute_from_command_line(sys.argv)


if __name__ == "__main__":
    main()

Replace urls.py with:

from django.http import HttpResponse
from django.urls import path
from opentelemetry.instrumentation.django import DjangoInstrumentor

DjangoInstrumentor().instrument()


def handler(request):
    return HttpResponse("Hello, World!")


urlpatterns = [
    path("", handler, name="handler"),
]

Run: uv run python manage.py runserver 8005

Go to http://localhost:8005/%F0%9F%98%84 and observe the crash.

Expected Result

No 500 error should occur. URL-encoded Unicode characters in the URL should not cause a crash.
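One way this behaviour could be reached is to tolerate a PATH_INFO that has already been decoded as UTF-8. The sketch below is an assumption, not the project's actual fix: `tolerant_request_uri` is a hypothetical helper that retries `wsgiref.util.request_uri` after re-expressing the path as the Latin-1 "bytes-as-unicode" string that WSGI (PEP 3333) expects. The environ dictionary is hand-built for illustration.

```python
from wsgiref.util import request_uri


def tolerant_request_uri(environ):
    """Hypothetical helper: build the request URI without raising on UTF-8 paths."""
    try:
        return request_uri(environ)
    except UnicodeEncodeError:
        # Assumption: PATH_INFO was decoded as UTF-8 somewhere upstream.
        # Convert it back to the latin-1 "bytes-as-unicode" form on a copy,
        # leaving the caller's environ untouched.
        patched = dict(environ)
        path_info = environ.get("PATH_INFO", "")
        patched["PATH_INFO"] = path_info.encode("utf-8").decode("latin-1")
        return request_uri(patched)


# Illustrative environ with the minimum keys request_uri() reads.
environ = {
    "wsgi.url_scheme": "http",
    "HTTP_HOST": "localhost:8005",
    "SCRIPT_NAME": "",
    "PATH_INFO": "/\U0001F604",
}
print(tolerant_request_uri(environ))
```

The helper only handles PATH_INFO; a non-Latin-1 SCRIPT_NAME would still raise, and whether retrying like this is appropriate for the instrumentation is a judgment for the maintainers.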

Actual Result

Watching for file changes with StatReloader
Performing system checks...

System check identified no issues (0 silenced).

Run 'python manage.py migrate' to apply them.
February 04, 2025 - 01:55:11
Django version 5.1.5, using settings 'crash_repro.settings'
Starting development server at http://127.0.0.1:8005/
Quit the server with CONTROL-C.

Internal Server Error: /😄
Traceback (most recent call last):
  File "/Users/aaron/Developer/tmp/django-instrumentation-crash-repro-2/.venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aaron/Developer/tmp/django-instrumentation-crash-repro-2/.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/django/middleware/otel_middleware.py", line 91, in __call__
    self.process_request(request)
  File "/Users/aaron/Developer/tmp/django-instrumentation-crash-repro-2/.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/django/middleware/otel_middleware.py", line 217, in process_request
    attributes = collect_request_attributes(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aaron/Developer/tmp/django-instrumentation-crash-repro-2/.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/wsgi/__init__.py", line 358, in collect_request_attributes
    wsgiref_util.request_uri(environ)
  File "/Users/aaron/.local/share/uv/python/cpython-3.11.10-macos-aarch64-none/lib/python3.11/wsgiref/util.py", line 61, in request_uri
    path_info = quote(environ.get('PATH_INFO',''), safe='/;=,', encoding='latin1')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/aaron/.local/share/uv/python/cpython-3.11.10-macos-aarch64-none/lib/python3.11/urllib/parse.py", line 893, in quote
    string = string.encode(encoding, errors)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'latin-1' codec can't encode character '\U0001f604' in position 1: ordinal not in range(256)
[04/Feb/2025 01:55:14] "GET /%F0%9F%98%84 HTTP/1.1" 500 101858
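
The last frames of the traceback can be reproduced without Django or OpenTelemetry. The sketch below is a reduced illustration, assuming a hand-built environ (`make_environ` is a hypothetical helper, not part of any library): `wsgiref.util.request_uri` succeeds when PATH_INFO holds the Latin-1 "bytes-as-unicode" form of the path, and raises the same UnicodeEncodeError when PATH_INFO holds the UTF-8-decoded text.

```python
from wsgiref.util import request_uri


def make_environ(path_info):
    """Hypothetical helper: the minimum WSGI keys request_uri() reads."""
    return {
        "wsgi.url_scheme": "http",
        "HTTP_HOST": "localhost:8005",
        "SCRIPT_NAME": "",
        "PATH_INFO": path_info,
    }


raw_path = b"/\xf0\x9f\x98\x84"          # bytes sent for /%F0%9F%98%84
latin1_path = raw_path.decode("latin-1")  # form described by PEP 3333
utf8_path = raw_path.decode("utf-8")      # "/" followed by U+1F604

# Latin-1 form: every character maps back to one byte, so quoting works.
ok_uri = request_uri(make_environ(latin1_path))
print(ok_uri)

# UTF-8-decoded form: the emoji cannot be encoded as latin-1.
try:
    request_uri(make_environ(utf8_path))
    raised = False
except UnicodeEncodeError as exc:
    raised = True
    print(type(exc).__name__, exc.encoding)
```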

Additional context

No response

Would you like to implement a fix?

None

kujenga, Feb 04 '25 01:02