feat(telemetry): add OpenTelemetry OTLP support with comprehensive observability
Details
Adds comprehensive OpenTelemetry (OTLP) support to authentik for distributed tracing, metrics, and structured logging. This implementation provides enterprise-grade observability across both Python and Go components while maintaining backward compatibility with existing Prometheus and Sentry integrations.
Key features:
- OTLP gRPC/HTTP export support with configurable endpoints
- Unified configuration across Python Django and Go microservices
- Django middleware for automatic request tracing
- Go HTTP middleware with otelhttp instrumentation
- Adaptive sampling (excludes health checks, higher error rates)
- Auto-instrumentation for Django, PostgreSQL, Redis, Celery, and Go HTTP
- Custom metrics for flows, policies, authentication events, and LDAP operations
- Graceful degradation when OpenTelemetry packages unavailable
- Latest OpenTelemetry Go v1.37.0 and Python packages
Configuration:
telemetry:
  otlp:
    enabled: false
    endpoint: "localhost:4317"
    protocol: "grpc"
    traces_sample_rate: 0.1
    service_name: "authentik"
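As a rough illustration of how the adaptive sampling listed under key features could interact with traces_sample_rate, here is a minimal sketch. It is not the code in this PR: the function name, the health check prefix, and the rule that server errors are always kept are assumptions made for the example. Because it looks at the response status, it models a decision taken after the request has finished.

```python
import random
from typing import Optional

# Hypothetical values for illustration only; not taken from the PR.
HEALTH_CHECK_PREFIXES = ("/-/health/",)
ERROR_SAMPLE_RATE = 1.0  # assumption: keep every trace that ended in a 5xx


def should_sample(
    path: str,
    status_code: int,
    base_rate: float = 0.1,
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide whether to keep the trace for one finished request."""
    # Health checks are frequent and uninteresting, so they are never kept.
    if path.startswith(HEALTH_CHECK_PREFIXES):
        return False
    # Server errors use the higher rate; everything else uses the base rate.
    rate = ERROR_SAMPLE_RATE if status_code >= 500 else base_rate
    # random() returns a value in [0.0, 1.0), so rate 1.0 always keeps
    # the trace and rate 0.0 never does.
    draw = (rng or random).random()
    return draw < rate
```

With the default base_rate of 0.1, roughly one in ten successful requests would be kept, while health checks are always dropped and server errors always kept.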
Files added:
Python telemetry:
- authentik/lib/telemetry/ - Core telemetry implementation
- authentik/lib/tests/test_telemetry*.py - Comprehensive test suite (28 tests)
- Extended ConfigLoader with get_float() method
Go telemetry:
- internal/telemetry/provider.go - OpenTelemetry provider with OTLP configuration
- internal/telemetry/middleware.go - HTTP, LDAP, and Auth middleware
- internal/telemetry/telemetry.go - Global interface and convenience functions
- internal/telemetry/telemetry_test.go - Comprehensive Go test suite
- internal/telemetry/example_integration.go - Integration examples
- internal/config/struct.go - Added TelemetryConfig and OTLPConfig
- Updated go.mod with latest OpenTelemetry Go v1.37.0 dependencies
- Updated authentik/lib/default.yml with telemetry configuration
Documentation:
- OTLP_DOCUMENTATION.md - Consolidated implementation guide with Mermaid architecture diagram
closes #12854
Checklist
- [x] Local tests pass (ak test authentik/)
- [x] The code has been formatted (make lint-fix)

If an API change has been made

- [x] The API schema has been updated (make gen-build)

If changes to the frontend have been made

- [x] The code has been formatted (make web)

If applicable

- [x] The documentation has been updated
- [x] The documentation has been formatted (make docs)
Deploy Preview for authentik-storybook ready!
| Name | Link |
|---|---|
| Latest commit | 370dfaeee9c1b7824c383dfe7673413647880a67 |
| Latest deploy log | https://app.netlify.com/projects/authentik-storybook/deploys/689b527687e26b00081be1fa |
| Deploy Preview | https://deploy-preview-15804--authentik-storybook.netlify.app |
To edit notification comments on pull requests, go to your Netlify project configuration.
Deploy Preview for authentik-docs ready!
| Name | Link |
|---|---|
| Latest commit | 370dfaeee9c1b7824c383dfe7673413647880a67 |
| Latest deploy log | https://app.netlify.com/projects/authentik-docs/deploys/689b5276f28be9000817ca65 |
| Deploy Preview | https://deploy-preview-15804--authentik-docs.netlify.app |
Deploy Preview for authentik-integrations ready!
| Name | Link |
|---|---|
| Latest commit | 370dfaeee9c1b7824c383dfe7673413647880a67 |
| Latest deploy log | https://app.netlify.com/projects/authentik-integrations/deploys/689b5276208a200008f08a9b |
| Deploy Preview | https://deploy-preview-15804--authentik-integrations.netlify.app |
Any updates on this PR?
We're in the middle of a release, so this is on the backlog for now. I'll take a look once we have released.
Ping on this. We are evaluating Authentik and the topic of observability is a question mark: we are on NewRelic so we can't use the Sentry integration that is already available. I think just supporting an open standard in OpenTelemetry makes more sense than trying to natively integrate every observability platform individually.
So thumbs up on this feature from me 👍
I am open (heh) to having a more open monitoring standard instead of just Sentry. However, I am not too familiar with the OTEL ecosystem, and it feels like this adds a lot of code and is very verbose, but that might just be how OTEL works.
We may consider this in the future
I think I might be in the intersection of Authentik developers and OTel developers. I don't think there should be an authentik.lib.telemetry package. The OpenTelemetry API itself is already an abstraction layer across different telemetry vendors; there's no advantage to adding yet another abstraction layer on top of it.
In general, library code should use opentelemetry.metrics.get_meter / opentelemetry.trace.get_tracer to construct a Meter or Tracer object on which to report telemetry (and if it makes sense, you could expose the meter_provider/tracer_provider arguments to allow the library's user to attach to a custom MeterProvider/TracerProvider). If OTel is not configured in the process, those functions will return a noop provider object that has very little overhead.
So for library/package code, we should be sprinkling in creation of metrics and trace spans as appropriate. The code in this PR imports directly from opentelemetry.sdk.*, but instrumented code should not be touching opentelemetry.sdk.*. MeterProvider and TracerProvider are available directly as opentelemetry.metrics.MeterProvider and opentelemetry.trace.TracerProvider. (OpenTelemetry calls this the "API" vs. the "SDK". The API is lightweight: it contains the surface that instrumented code should use to report telemetry, plus a noop implementation. It's fine and expected for library code to have a required dependency on the OpenTelemetry API. The OpenTelemetry SDK, by contrast, is a heavyweight implementation of the API; users don't have to use it as their API implementation, and libraries should not depend on it.)
As far as the middleware goes, there's already rich support for OpenTelemetry in Django with the opentelemetry-instrumentation-django package. Authentik shouldn't have its own instrumentation for packages that already have supported upstream instrumentation.
So all of that covers how the code should be instrumented, which is separate from any configuration of where that telemetry should be directed. There are two ways that OpenTelemetry wants you to do that. The recommended way is to use something called "zero-code instrumentation": basically, OTel provides a launcher (and a k8s operator) that will inject user-controlled telemetry configuration, without the application ever needing to directly interact with the SDK at all. Unfortunately, Go support for this is much less complete. They're working on standardizing on a single config file format across all languages, but that's currently experimental. The alternative is to manually construct and install an SDK object with the desired configuration, and that should only happen in a main() function or equivalent. The downside is that this limits the user to a single API implementation, configured only in the ways that you chose to expose (for example, you might offer a way to configure a Jaeger exporter, but if a user wants to use Zipkin they're SOL).