Zitadel OpenTelemetry traces resulting in high cardinality
Preflight Checklist
- [X] I could not find a solution in the existing issues, docs, nor discussions
- [X] I have joined the ZITADEL chat
Describe your problem
We set up the Zitadel config to record OTel traces and OTel metrics on each endpoint: https://github.com/zitadel/zitadel/blob/b055d1d9e67587eacce8e34649085bcd3268a055/cmd/defaults.yaml#L11
We've noticed that the cardinality of two specific operations is extremely high:
{ZITADEL_DOMAIN}/v2beta/idp_intents/<idp_intent_id>
{ZITADEL_DOMAIN}/v2beta/oidc/auth_requests/<auth_request_id>
These IDs are generated on each login, so every login produces a new operation name, which is causing a lot of issues on the ingesting side. This is further exacerbated by the metrics that are computed from these operations.
e.g.
latency_bucket{operation="{ZITADEL_DOMAIN}/v2beta/oidc/auth_requests/V2_270990429988614816",service_name="ZITADEL",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="5"} 0
latency_bucket{operation="{ZITADEL_DOMAIN}/v2beta/oidc/auth_requests/V2_270990429988614816",service_name="ZITADEL",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="10"} 10
...
Describe your ideal solution
These operations should record the Auth Request IDs and IDP Intent IDs as tags (attributes) on the span rather than leaving them in the path that is used as the operation name.
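To illustrate the idea, here is a minimal stdlib-only sketch of rewriting an operation name into a low-cardinality route template while keeping the ID as a separate key/value pair. It is not how Zitadel builds span names today. The function name `normalizeOperation`, the rule table, and the attribute keys `zitadel.auth_request_id` and `zitadel.idp_intent_id` are assumptions made up for this example, and the host stripping assumes the operation has the `host/path` shape shown above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// routeRule maps one concrete path shape to a low-cardinality template.
type routeRule struct {
	pattern  *regexp.Regexp // matches the path and captures the dynamic ID
	template string         // name to emit instead of the raw path
	attrKey  string         // hypothetical attribute key for the captured ID
}

// rules covers only the two operations named in this issue.
var rules = []routeRule{
	{
		pattern:  regexp.MustCompile(`^/v2beta/oidc/auth_requests/([^/]+)$`),
		template: "/v2beta/oidc/auth_requests/{auth_request_id}",
		attrKey:  "zitadel.auth_request_id",
	},
	{
		pattern:  regexp.MustCompile(`^/v2beta/idp_intents/([^/]+)$`),
		template: "/v2beta/idp_intents/{idp_intent_id}",
		attrKey:  "zitadel.idp_intent_id",
	},
}

// normalizeOperation drops the host prefix (everything before the first "/")
// and replaces a known dynamic path with its template. The captured ID is
// returned in attrs so it could be attached to the span as an attribute.
func normalizeOperation(op string) (string, map[string]string) {
	path := op
	if i := strings.Index(op, "/"); i > 0 {
		path = op[i:]
	}
	attrs := map[string]string{}
	for _, r := range rules {
		if m := r.pattern.FindStringSubmatch(path); m != nil {
			attrs[r.attrKey] = m[1]
			return r.template, attrs
		}
	}
	// Unknown paths are returned unchanged (minus the host).
	return path, attrs
}

func main() {
	name, attrs := normalizeOperation(
		"zitadel.example.com/v2beta/oidc/auth_requests/V2_270990429988614816")
	fmt.Println(name)
	fmt.Println(attrs)
}
```

With this approach the `operation` label on the derived metrics would take one value per route instead of one per login, while the ID stays searchable on the individual span.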
Version
2.53.4
Environment
Self-hosted
Additional Context
Reducing sampling to 0.1 helped a bit to buy us some time :)
@adlerhurst when checking this, we should also remove the host from the name
Currently open questions:
How to map path to route? e.g. that PUT /users/1235 gets PUT /users/{user_id}
How to add context information to span? baggage or label?
Regarding "How to map path to route? e.g. that PUT /users/1235 gets PUT /users/{user_id}":
https://grpc-ecosystem.github.io/grpc-gateway/docs/operations/annotated_context/ might help
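As a rough picture of the path-to-route mapping being asked about, the sketch below matches a concrete path against route templates segment by segment, using only the standard library. It is a stand-in for illustration, not the grpc-gateway mechanism from the link. The helper names `matchRoute` and `spanName`, the template list, and the `unknown_route` fallback are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// matchRoute compares a concrete path with a template such as
// "/users/{user_id}". Segments wrapped in braces match any non-empty
// value and are returned as parameters; all other segments must be equal.
func matchRoute(template, path string) (map[string]string, bool) {
	tSegs := strings.Split(strings.Trim(template, "/"), "/")
	pSegs := strings.Split(strings.Trim(path, "/"), "/")
	if len(tSegs) != len(pSegs) {
		return nil, false
	}
	params := map[string]string{}
	for i, t := range tSegs {
		if strings.HasPrefix(t, "{") && strings.HasSuffix(t, "}") {
			if pSegs[i] == "" {
				return nil, false
			}
			params[t[1:len(t)-1]] = pSegs[i]
			continue
		}
		if t != pSegs[i] {
			return nil, false
		}
	}
	return params, true
}

// spanName returns "METHOD template" for the first matching template.
// Unmatched paths collapse into a single fixed name, so an unknown path
// can never introduce a new span name.
func spanName(method, path string, templates []string) (string, map[string]string) {
	for _, t := range templates {
		if params, ok := matchRoute(t, path); ok {
			return method + " " + t, params
		}
	}
	return method + " unknown_route", nil
}

func main() {
	templates := []string{"/users/{user_id}"}
	name, params := spanName("PUT", "/users/1235", templates)
	fmt.Println(name)
	fmt.Println(params)
}
```

The returned parameters are the values that would then need to travel with the span, which is where the open baggage-versus-label question comes in.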