Prometheus: More detailed metrics

Open danielh1989 opened this issue 7 years ago • 7 comments

Do you want to request a feature or report a bug?

Feature

What did you expect to see?

I would like to have more detailed metrics with Prometheus. For example:

  • Requests per Route
  • Requests per Backend
  • Average Time per Backend
  • Average Time per Route

danielh1989 avatar Apr 17 '18 08:04 danielh1989

Hi @danielh1989,

Thanks for your interest in the Træfik project.

For information, the currently available metrics are:

  • traefik_config_reloads_total
  • traefik_config_reloads_failure_total
  • traefik_config_last_reload_success
  • traefik_config_last_reload_failure
  • traefik_entrypoint_requests_total
  • traefik_entrypoint_request_duration_seconds
  • traefik_entrypoint_open_connections
  • traefik_backend_requests_total
  • traefik_backend_request_duration_seconds
  • traefik_backend_open_connections
  • traefik_backend_retries_total
  • traefik_backend_server_up

mmatur avatar Apr 18 '18 08:04 mmatur

What exactly do you mean by "per route"? Prometheus doesn't like "unbounded" label values, so putting an arbitrary path into a label is not an option if you don't want to kill your Prometheus server.
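
To make the cardinality concern concrete, here is a minimal sketch. It is only an illustrative model, not Traefik or Prometheus code: it assumes one time series per distinct label value, and the `/users/{id}/profile` paths are hypothetical.

```python
# Illustrative model only: a metrics backend keeps one time series per
# unique combination of label values, so we simply count distinct labels.

def series_count(requests, label_fn):
    """Return how many distinct time series the given labelling produces."""
    return len({label_fn(path) for path in requests})

# Hypothetical traffic: 1000 requests, each with a unique ID in the path.
requests = [f"/users/{user_id}/profile" for user_id in range(1000)]

# Raw request path as the label: one series per distinct URL.
raw = series_count(requests, lambda path: path)

# Bounded label: every request collapses into one templated route.
templated = series_count(requests, lambda path: "/users/{id}/profile")

print(raw)        # 1000
print(templated)  # 1
```

With raw paths the series count grows with the traffic itself, while a bounded label keeps it fixed.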

m3co-code avatar Apr 25 '18 17:04 m3co-code

As an example, I'd really like to know which API (based on a path) is slower than others when using the traefik_backend_requests_total metric. My services don't have an unbounded number of APIs, so perhaps Traefik could fill the label only for backend requests that are not a 404, to avoid client spam.
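
A rough sketch of that suggestion, assuming a hypothetical `path_label` rule (this is not an existing Traefik option) that leaves the label empty whenever the backend answers 404:

```python
from collections import Counter

def path_label(path: str, status: int) -> str:
    """Hypothetical rule: keep the path as a label only for non-404 responses."""
    # Scanners probing random URLs get 404s, so they cannot create new series.
    return "" if status == 404 else path

# Hypothetical sample of (path, status) pairs seen by the proxy.
samples = [
    ("/api/orders", 200),
    ("/api/orders", 500),
    ("/wp-login.php", 404),
    ("/.env", 404),
]

requests_total = Counter(path_label(path, status) for path, status in samples)
print(dict(requests_total))
```

The label set is then bounded by the paths the backend actually serves, which only helps if the backend itself accepts a bounded set of paths.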

Place1 avatar Mar 09 '19 00:03 Place1

I recently moved from the ingress-nginx controller to Traefik, and only now realised that Traefik unfortunately has no way to collect metrics by URL path. In the ingress-nginx controller, for example, this works by default: https://github.com/kubernetes/ingress-nginx/blob/96b6228a6b65a85e421b8a348a149e99181664d1/deploy/grafana/dashboards/request-handling-performance.json#L314

Traefik has a lot of pros, like SSL certificate management and very convenient routing, but observability is also one of the most important things, and right now I can't tell which route of my API is slow. I hope the path label will be added to the metrics in the near future.

mgerasimchuk avatar Jun 12 '22 14:06 mgerasimchuk

I found the metric labels were more usable if I used an intermediate TraefikService in the path to my actual k8s service.

rayjanoka avatar Jun 12 '22 22:06 rayjanoka

Yes, observability for every URL path (route) is important for me too, including for alerts and notifications about traffic incidents.

hick avatar Feb 09 '23 08:02 hick

Two years have passed, and I tried to solve this problem again. There is some good news.

This is how it's possible to collect path metrics with Traefik (spoiler: only requests_total, not histogram/latency).

  1. Create a Middleware that exposes the path as a header

middleware.yaml:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: x-replace-path-metrics-header
spec:
  replacePathRegex:
    regex: ^(.*)
    replacement: $1

It leaves the path unchanged, but as a side effect of applying this middleware, the request gets an X-Replaced-Path header containing our path.

  2. Apply the middleware to your IngressRoute

ingress.yaml:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: app
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`app.co`)
      kind: Rule
      middlewares:
        - name: x-replace-path-metrics-header
      services:
        - kind: Service
          name: app
          port: 80

  3. Group your paths to avoid cardinality issues. (This should really come first, because the Traefik Helm chart has to be installed before the manifests are applied, but I keep it as the third step to make the idea clear.)

values.yaml (values for the traefik helm chart)

# Ask the Prometheus metrics exporter to use our X-Replaced-Path header as a label
additionalArguments:
  # details here: https://doc.traefik.io/traefik/observability/metrics/prometheus/#headerlabels
  - '--metrics.prometheus.headerlabels.path=X-Replaced-Path'

metrics:
  prometheus:

    # Some basic, non-critical settings for Prometheus
    addEntryPointsLabels: true
    addRoutersLabels: true
    addServicesLabels: true
    serviceMonitor:
      enabled: true
      additionalLabels:
        # The Prometheus release name can be found with: kubectl -n <prometheus ns> get servicemonitors.monitoring.coreos.com kube-prometheus-stack-prometheus -o yaml | yq .metadata.labels.release
        release: kube-prometheus-stack

      # And finally the grouping: for example, you can squash the URLs /assets/e2301aaa/bundle.js and /assets/e2301ccc/bundle.js into a single /assets/{hash}/bundle.js
      # In my case I don't set the metricRelabelings field in the values file; instead I set it during the Helm chart installation and generate the relabeling config in the "gateway" application I am interested in
      # So my Traefik installation command ends up looking like:
      # helm install traefik traefik/traefik -f values.yaml --set-json "metrics.prometheus.serviceMonitor.metricRelabelings=$(make -C ./../gateway generate-relabelings-config)"
      metricRelabelings:
        - sourceLabels: [ path ]
          regex: ^/assets/[0-9a-z]+/(.*)$
          targetLabel: path
          replacement: /assets/{hash}/$1
          action: replace

Almost everything looks good: with this setup I can see per-path metrics for my requests (and per-host metrics too, by adding --metrics.prometheus.headerlabels.host=X-Forwarded-Host to additionalArguments).
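
The three steps above can be sketched end to end as a small model. This is not Traefik or Prometheus code. It assumes, as described above, that the middleware records the original path in X-Replaced-Path, that headerlabels copies that header into a `path` label, and that the `replace` relabeling only applies when the regex matches the whole label value. All function names are hypothetical.

```python
import re
from collections import Counter

REPLACED_PATH_HEADER = "X-Replaced-Path"

def replace_path_regex(path, headers, regex=r"^(.*)", replacement=r"\g<1>"):
    """Step 1 (model): rewrite the path and record the original in a header."""
    if re.search(regex, path) is None:
        return path, headers
    # With regex ^(.*) and a replacement of group 1 the path is unchanged.
    return re.sub(regex, replacement, path), {**headers, REPLACED_PATH_HEADER: path}

def header_label(headers, header=REPLACED_PATH_HEADER):
    """Step 2 (model): headerlabels copies a request header into a metric label."""
    return headers.get(header, "")

def relabel(value, regex=r"/assets/[0-9a-z]+/(.*)", replacement=r"/assets/{hash}/\g<1>"):
    """Step 3 (model): 'replace' relabeling; the regex must match the whole value."""
    match = re.fullmatch(regex, value)
    return match.expand(replacement) if match else value

requests_total = Counter()
for url_path in ["/assets/e2301aaa/bundle.js", "/assets/e2301ccc/bundle.js", "/api/orders"]:
    _, headers = replace_path_regex(url_path, {})
    requests_total[relabel(header_label(headers))] += 1

print(dict(requests_total))
```

In the real setup the relabeling is applied to scraped series at scrape time rather than per request. The model sums the counts for simplicity and does not cover how Prometheus treats series that become identical after relabeling.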

There is only one problem: unfortunately, this setup works only for the requests_total metrics and doesn't work for the histogram metrics.

As a result, I can see the request rate per path, but I still can't see the latency per path.
Fortunately, @leonlyu1996 did great work in https://github.com/traefik/traefik/issues/10774 to add a histogram for the headerlabels metrics. I hope the Traefik team will merge @leonlyu1996's solution.

mgerasimchuk avatar Nov 26 '24 15:11 mgerasimchuk

Needing to add a new header to get this to work is a really ugly hack. C'mon, this should be pretty basic functionality available out of the box (and it is available out of the box w/ ingress-nginx, with ASP.NET Core's built-in metrics etc etc). All you really need is something in the docs to say "make damn sure you group your paths and understand their boundedness or you'll make your Prom server sad".

I really do reject this notion that it suddenly becomes a breaking change in the software because a user becomes able to do something dumb with it and cause themselves pain. If you play stupid games, you win stupid prizes. @Place1's suggestion for stripping out 404s is a sensible one; then you just need to have an understanding of the paths your routing framework will accept.

lol768 avatar Jan 14 '25 10:01 lol768

Hello,

Thanks for all the feedback! However, there is still something unclear to me @hick @lol768 @mgerasimchuk @Place1

Do you want the request path or the router's rule path to be added as a label to metrics?

rtribotte avatar Jan 14 '25 13:01 rtribotte

Hi @rtribotte ,

Ideally, we want the HTTP request path.

But at least for me, having the Traefik router rule would also be a good alternative.

mgerasimchuk avatar Jan 16 '25 05:01 mgerasimchuk

Hello, I raise my hand for this feature, especially for the information about the request path. I do not fully understand the problem with the route rule versus the HTTP request path. IMHO the HTTP request path should be used, i.e. the current one passed to the metrics middleware.

jkblume avatar Mar 19 '25 08:03 jkblume