swift-distributed-tracing icon indicating copy to clipboard operation
swift-distributed-tracing copied to clipboard

Clarify documentation about the interaction of recordError and setStatus

Open czechboy0 opened this issue 10 months ago • 2 comments

Currently the docs aren't prescriptive enough, which can make switching between tracing backends more difficult.

For example, does recordError imply:

  1. An error occurred during the span, and the span should be considered errored as well, regardless of calls to setStatus
  2. An error occurred during the span, and the span should be considered errored as well, unless setStatus was called with OK
  3. An error occurred during the span, but the overall status of the span should not be considered errored (unless setStatus is also called with Error)
  4. An error occurred during the span, and as a library we don't attribute any specific semantic meaning to the corresponding status of the span
  5. Something else?

czechboy0 avatar Mar 06 '25 11:03 czechboy0

Thanks for bringing this up! It's a mixture of 3 and 4. Generally, recorded errors are not related to a span's status, meaning the status has to be manually set and is not automatically derived from whether a span recorded any errors. Setting a status is very much explicit and dependent on the use-case. Here's the OTel semantic convention around setting the span status for HTTP spans for example:

Span Status MUST be left unset if HTTP status code was in the 1xx, 2xx or 3xx ranges, unless there was another error (e.g., network error receiving the response body; or 3xx codes with max redirects exceeded), in which case status MUST be set to Error.

For HTTP status codes in the 4xx range span status MUST be left unset in case of SpanKind.SERVER and SHOULD be set to Error in case of SpanKind.CLIENT.

For HTTP status codes in the 5xx range, as well as any other code the client failed to interpret, span status SHOULD be set to Error.

Don't set the span status description if the reason can be inferred from http.response.status_code.

HTTP request may fail if it was cancelled or an error occurred preventing the client or server from sending/receiving the request/response fully.

When instrumentation detects such errors it SHOULD set span status to Error and SHOULD set the error.type attribute.

https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#status

We followed this principle when we designed the span status API for Swift Distributed Tracing. You can also see this in Hummingbird's TracingMiddleware, which only sets the span status in a certain case but always records an error:

// ...
catch {
    if let endpointPath = context.endpointPath {
        span.operationName = endpointPath
    }
    let statusCode = (error as? HTTPResponseError)?.status.code ?? 500
    span.attributes["http.status_code"] = statusCode
    if 500..<600 ~= statusCode {
        span.setStatus(.init(code: .error))
    }
    span.recordError(error)
    span.end()
    throw error
}

https://github.com/hummingbird-project/hummingbird/blob/8c2157a55354d4ebde8671d1e4a9005edc920dc4/Sources/Hummingbird/Middleware/TracingMiddleware.swift#L111

slashmo avatar Mar 06 '25 12:03 slashmo

This is great, thanks @slashmo! Let's use this issue to track adding much of your explanation into the docs.

czechboy0 avatar Mar 06 '25 12:03 czechboy0