opentelemetry-python-contrib icon indicating copy to clipboard operation
opentelemetry-python-contrib copied to clipboard

opentelemetry-instrumentation-asgi: root_path is handled not compliant to asgi spec

Open dhofstetter opened this issue 1 year ago • 0 comments
trafficstars

There is a problem with handling root_path scope in asgi applications with opentelemetry-instrumentation-asgi. As soon as you have a setup where root_paths come into account, the attribute http.url of the server span is not built correctly anymore.

I have prepared some example to show some variations when this might occur. I did some deeper investigation to figure out the correct source of reason (e.g. to ensure that not fastapi or starlette - the ones I used to create the examples - are the source of the error because of a wrong handling there). The conclusion is that according to asgi spec, the asgi instrumentation handles the root_path scope not as defined (see last section with proposed fix). Finally I prepared a Pull Request that fixes that issue including unit tests to ensure that this kind of error can be detected in case of reappearance with future changes.

Describe your environment

Python

  • Python 3.9.14

Packages

Package                                  Version
---------------------------------------- --------
annotated-types                          0.6.0
anyio                                    4.3.0
asgiref                                  3.8.1
certifi                                  2024.2.2
charset-normalizer                       3.3.2
click                                    8.1.7
deprecated                               1.2.14
exceptiongroup                           1.2.1
fastapi                                  0.110.2
googleapis-common-protos                 1.63.0
grpcio                                   1.62.2
h11                                      0.14.0
idna                                     3.7
importlib-metadata                       7.0.0
opentelemetry-api                        1.24.0
opentelemetry-distro                     0.45b0
opentelemetry-exporter-otlp              1.24.0
opentelemetry-exporter-otlp-proto-common 1.24.0
opentelemetry-exporter-otlp-proto-grpc   1.24.0
opentelemetry-exporter-otlp-proto-http   1.24.0
opentelemetry-instrumentation            0.45b0
opentelemetry-instrumentation-asgi       0.45b0
opentelemetry-instrumentation-asyncio    0.45b0
opentelemetry-instrumentation-fastapi    0.45b0
opentelemetry-instrumentation-starlette  0.45b0
opentelemetry-proto                      1.24.0
opentelemetry-sdk                        1.24.0
opentelemetry-semantic-conventions       0.45b0
opentelemetry-test-utils                 0.45b0
opentelemetry-util-http                  0.45b0
protobuf                                 4.25.3
pydantic                                 2.7.1
pydantic-core                            2.18.2
requests                                 2.31.0
setuptools                               69.5.1
sniffio                                  1.3.1
starlette                                0.37.2
typing-extensions                        4.11.0
urllib3                                  2.2.1
uvicorn                                  0.29.0
wrapt                                    1.16.0
zipp                                     3.18.1
  • The problem still exists on main
  • A pr fixing the issue including unit tests is on the way

Steps to reproduce

The latter definitions of a fastapi and a starlette application are used to create the spans as included. To visualize the rendering problem occurring in asgi instrumentation.

FastApi (subapp_fastapi.py)

import asyncio

import fastapi
from starlette.responses import PlainTextResponse
import logging
import uvicorn
from opentelemetry.instrumentation.starlette import StarletteInstrumentor

logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.DEBUG)
logging.getLogger("opentelemetry.exporter").setLevel(logging.DEBUG)

sub_app = fastapi.FastAPI()

@sub_app.get("/home")
def sub_home():
    return PlainTextResponse("sub hi")

app = fastapi.FastAPI(root_path="/root")

app.mount("/sub", app=sub_app)

Starlette (subapp_starlette.py)

import asyncio

import starlette.applications as applications
from starlette.responses import PlainTextResponse
from starlette.routing import Route, Mount
import logging
from opentelemetry.instrumentation.starlette import StarletteInstrumentor

logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.DEBUG)
logging.getLogger("opentelemetry.exporter").setLevel(logging.DEBUG)

def sub_home(_):
    return PlainTextResponse("sub hi")


sub_app = applications.Starlette(routes=[Route("/home", sub_home)])

app = applications.Starlette(
    routes=[
        Mount("/sub", app=sub_app),
    ],
)
# I required the starlette app to be instrumented manually, well nevertheless this is not the point here
StarletteInstrumentor().instrument_app(app)

Start the apps and issue a curl

# startup fastapi
opentelemetry-instrument \
    --traces_exporter console,otlp \
    --metrics_exporter none \
    --service_name your-service-name \
    --exporter_otlp_endpoint 0.0.0.0:4317 \
    uvicorn "subapp_fastapi:app"

# issue a curl
# as we set root_path in python app in this example, the api has to be reached including root-path in this example, a modified
# version as with the starlette approach can be made as well. That changes the curl, but not the span generated by asgi instrumentation
curl http://127.0.0.1:8000/root/sub/home 

# startup starlette
opentelemetry-instrument \
    --traces_exporter console,otlp \
    --metrics_exporter none \
    --service_name your-service-name \
    --exporter_otlp_endpoint 0.0.0.0:4317 \
    uvicorn --root-path="/root" "subapp_starlette:app"
# issue a curl (be aware that for starlette, even tough the root-path is set by uvicorn, the app is reachable without /root, which 
# is fine as a proxy would have called the app that way as well
curl http://127.0.0.1:8000/sub/home 

FastApi Span

{
    "name": "GET /home",
    "context": {
        "trace_id": "0x1959bfcadb5cac6861677bc4d0fc2b99",
        "span_id": "0xee85071f511ce3e0",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x589c256374e8e0f9",
    "start_time": "2024-04-29T19:59:47.893212Z",
    "end_time": "2024-04-29T19:59:47.898762Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.scheme": "http",
        "http.host": "127.0.0.1:8000",
        "net.host.port": 8000,
        "http.flavor": "1.1",
        "http.target": "/root/sub/home",
        "http.url": "http://127.0.0.1:8000/root/sub/root/sub/home",
        "http.method": "GET",
        "http.server_name": "127.0.0.1:8000",
        "http.user_agent": "curl/8.4.0",
        "net.peer.ip": "127.0.0.1",
        "net.peer.port": 57323,
        "http.route": "/home",
        "http.status_code": 200
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.24.0",
            "service.name": "your-service-name",
            "telemetry.auto.version": "0.45b0"
        },
        "schema_url": ""
    }
}

Starlette Span

{
    "name": "GET /sub",
    "context": {
        "trace_id": "0xf2db1f245924b5495b93b66ce03cf77e",
        "span_id": "0x97c2381a665d186c",
        "trace_state": "[]"
    },
    "kind": "SpanKind.SERVER",
    "parent_id": null,
    "start_time": "2024-04-29T20:03:22.887488Z",
    "end_time": "2024-04-29T20:03:22.888936Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.scheme": "http",
        "http.host": "127.0.0.1:8000",
        "net.host.port": 8000,
        "http.flavor": "1.1",
        "http.target": "/root/sub/home",
        "http.url": "http://127.0.0.1:8000/root/root/sub/home",
        "http.method": "GET",
        "http.server_name": "127.0.0.1:8000",
        "http.user_agent": "curl/8.4.0",
        "net.peer.ip": "127.0.0.1",
        "net.peer.port": 57368,
        "http.route": "/sub",
        "http.status_code": 200
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.24.0",
            "service.name": "your-service-name",
            "telemetry.auto.version": "0.45b0"
        },
        "schema_url": ""
    }
}

What is the expected behavior?

The expectation is that for the attributes of the SpanKind.SERVER span, we get correct results for

  • http.target -> is already as expected
  • http.url -> the url built is
    • fastapi: http://127.0.0.1:8000/root/sub/home
    • starlette: http://127.0.0.1:8000/root/sub/home

What is the actual behavior?

  • http.url -> the url built is
    • fastapi: http://127.0.0.1:8000/root/sub/root/sub/home
    • starlette: http://127.0.0.1:8000/root/root/sub/home

Additional context

The problem is related to the handling of the scope values in get_host_port_url_tuple (see instrumentation/opentelemetry-instrumentation-asgi/src/opentelemetry/instrumentation/asgi/__init__.py:368)

My proposal for fixing the issue (including the asgi spec sections that might specify the correct interpretation of the scopes is

def get_host_port_url_tuple(scope):
    """Returns (host, port, full_url) tuple."""
    server = scope.get("server") or ["0.0.0.0", 80]
    port = server[1]
    server_host = server[0] + (":" + str(port) if str(port) != "80" else "")
    # To get the correct virtual url path within the hosting application (e.g also in a subapplication scenario)
    # we have to remove the root_path from the path
    # see:
    # - https://asgi.readthedocs.io/en/latest/specs/www.html#http-connection-scope (see: root_path and path)
    # - https://asgi.readthedocs.io/en/latest/specs/www.html#wsgi-compatibility (see: PATH_INFO)
    #       PATH_INFO can be derived by stripping root_path from path
    #       -> that means that the path should contain the root_path already, so prefixing it again is not necessary
    # - https://wsgi.readthedocs.io/en/latest/definitions.html#envvar-PATH_INFO
    #
    # From investigation it seems (that at least for fastapi), the path is already correctly set. That means
    # that root_path is already included in the path, so we can use it directly for full path.
    # old way: full_path = scope.get("root_path", "") + scope.get("path", "")
    full_path = scope.get("path", "")
    http_url = scope.get("scheme", "http") + "://" + server_host + full_path
    return server_host, port, http_url

dhofstetter avatar Apr 29 '24 20:04 dhofstetter