False errors raised when configuring Traefik through the CLI, referencing a not-yet-discovered service (global errors middleware)
Welcome!
- [x] Yes, I've searched similar issues on GitHub and didn't find any.
- [X] Yes, I've searched similar issues on the Traefik community forum and didn't find any.
What did you do?
For our project, we configure Traefik through the CLI. For each entrypoint (here: http, https, traefik), we add the errors middleware, which relies on a service configured through the Docker provider (or Swarm provider); it is therefore referenced through the CLI as error-pages@docker or, when using the Swarm provider, as error-pages@swarm.
Everything works as it should in the end (see the provided screenshots), but when we start the Traefik instance, many (see below) FALSE errors are raised. Of course, strictly speaking, the errors are expected: at the time Traefik resolves the static configuration, the error-pages@docker middleware has not yet been discovered. The thing is, these errors are not really TRUE errors, in the sense that once the Docker services are discovered, everything works as it should.
So, the idea here would be to defer the configuration processing for options relying on services referenced as @docker or @swarm, meaning that those would be resolved and applied later on (i.e. after the first discovery pass). Of course, error handling should still be part of the process (e.g. if, after the first discovery, services are still not defined, the errors should be raised in a fatal way).
Now, if referencing a not-yet-discovered service should not be allowed, the error should be FATAL instead. We know we could use the file provider in its place, but ...
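The deferral described above could be sketched roughly as follows. This is a minimal illustration in Go (Traefik's language) with hypothetical names; it is not Traefik's internal API, only the decision logic we have in mind:

```go
package main

import "fmt"

// store simulates Traefik's dynamic configuration: empty while the static
// configuration is being resolved, filled after provider discovery. All
// names here are hypothetical, not Traefik internals.
type store struct {
	middlewares map[string]bool
}

// resolve checks an entrypoint's middleware reference. A cross-provider
// reference (e.g. "error-pages@docker") is deferred rather than reported
// as an error before the first discovery pass has run; it only becomes
// fatal when it is still missing afterwards.
func resolve(s *store, ref string, discoveryDone bool) string {
	if s.middlewares[ref] {
		return "applied"
	}
	if !discoveryDone {
		return "deferred" // no error yet: the provider has not produced its config
	}
	return "fatal" // still missing after discovery: raise a real error
}

func main() {
	s := &store{middlewares: map[string]bool{}}
	fmt.Println(resolve(s, "error-pages@docker", false)) // deferred
	s.middlewares["error-pages@docker"] = true           // first discovery pass ran
	fmt.Println(resolve(s, "error-pages@docker", true))  // applied
}
```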
What did you see instead?
```
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=http routerName=acme-http@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=traefik routerName=api@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=traefik routerName=dashboard@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=traefik routerName=ping@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=http routerName=acme-http@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=traefik routerName=dashboard@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=traefik routerName=ping@internal
iam-reverse-proxy-1 | 2024-09-30T19:57:33Z ERR error="middleware \"error-pages@docker\" does not exist" entryPointName=traefik routerName=api@internal
```
What version of Traefik are you using?
```
nuxwin@srv02:~/projects/git/agon-innovation/iam-helper/resources/traefik$ docker compose run iam-reverse-proxy traefik version
[+] Creating 1/0
 ✔ Container traefik-iam-error-pages-1  Running    0.0s
Version:      3.1.4
Codename:     comte
Go version:   go1.23.1
Built:        2024-09-19T13:47:17Z
OS/Arch:      linux/amd64
```
What is your environment & configuration?
Our Docker Compose file, which configures Traefik using CLI options:
```yaml
services:
  # Traefik reverse proxy service.
  iam-reverse-proxy:
    image: traefik:v3.1
    command:
      - /bin/sh
      - -c
      - |
        mkdir -p ${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}/traefik/dynamic/ \
        && touch ${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}/traefik/acme.json \
        && chmod 0600 ${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}/traefik/acme.json \
        && exec traefik \
          --log=true \
          --log.level='ERROR' \
          --log.filepath='/dev/stdout' \
          --log.format='common' \
          --accesslog='true' \
          --accesslog.addinternals='true' \
          --accesslog.filepath='/dev/stdout' \
          --accesslog.format='common' \
          --api=true \
          --api.dashboard='true' \
          --api.disabledashboardad='true' \
          --api.insecure='true' \
          --ping=true \
          --ping.entrypoint='traefik' \
          --entrypoints.traefik.address=':8080' \
          --entrypoints.traefik.http.middlewares='error-pages@docker' \
          --entrypoints.http.address=':80' \
          --entrypoints.http.http.middlewares='error-pages@docker' \
          --entryPoints.http.forwardedHeaders.insecure='true' \
          --entrypoints.http.http.encodequerysemicolons='true' \
          --entryPoints.http.http2.maxConcurrentStreams=50 \
          --entrypoints.https.address=':443' \
          --entrypoints.https.http.tls='true' \
          --entrypoints.https.http.middlewares='error-pages@docker' \
          --entryPoints.https.forwardedHeaders.insecure=true \
          --entrypoints.https.http.encodequerysemicolons=true \
          --entryPoints.https.http2.maxConcurrentStreams=50 \
          --providers.docker='true' \
          --providers.docker.constraints='Label(`iam.traefik.enable`, `true`)' \
          --providers.docker.exposedbydefault='false' \
          --providers.docker.watch='true' \
          --providers.docker.network='${IAM_NETWORK_NAME:-agon_iam_private}' \
          --providers.docker.endpoint='unix:///var/run/docker.sock' \
          --providers.file.directory='${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}/traefik/dynamic/' \
          --providers.file.watch='true' \
          --certificatesresolvers.letsencrypt.acme.email='${LETSENCRYPT_ACCOUNT_EMAIL:-no_account}' \
          --certificatesresolvers.letsencrypt.acme.httpchallenge='true' \
          --certificatesresolvers.letsencrypt.acme.storage='${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}/traefik/acme.json' \
          --certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint='http'
    ports:
      # Entrypoints configuration (http).
      # Listen on port 80 (default) on the host and forward to port 80 in the container.
      - target: 80
        published: ${TRAEFIK_HTTP_PORT:-80}
        protocol: tcp
        mode: host
      # Entrypoints configuration (https).
      # Listen on port 443 (default) on the host and forward to port 443 in the container.
      - target: 443
        published: ${TRAEFIK_HTTPS_PORT:-443}
        protocol: tcp
        mode: host
      # Entrypoints configuration (traefik).
      # Listen on port 8080 (default) on the host and forward to port 8080 in the
      # container. Used for the Traefik dashboard.
      - target: 8080
        published: ${TRAEFIK_DASHBOARD_PORT:-8080}
        protocol: tcp
        mode: host
    stop_grace_period: 30s
    healthcheck:
      test: 'wget -qO- http://127.0.0.1:8080/ping || exit 1'
      interval: 30s
      timeout: 3s
      retries: 5
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - agon-iam-helper-cache:${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}
      - $HOME/.iam/traefik:${IAM_HELPER_CACHE_DIR:-/opt/iam-helper/.cache}/traefik
    networks:
      - iam_private_network
    depends_on:
      - iam-error-pages

  # Error pages service for the reverse proxy.
  iam-error-pages:
    image: tarampampam/error-pages:3
    networks:
      - iam_private_network
    labels:
      - traefik.enable=true
      - iam.traefik.enable=true
      # Set up the middlewares to use for this service.
      - traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https
      - traefik.http.middlewares.redirect-to-https.redirectscheme.port=${TRAEFIK_HTTPS_PORT:-443}
      - traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=false
      - traefik.http.middlewares.gzip.compress=true
      # Set up the HTTP router for this service.
      - traefik.http.routers.http-error-pages.entrypoints=http
      - traefik.http.routers.http-error-pages.middlewares=redirect-to-https
      - traefik.http.routers.http-error-pages.rule=HostRegexp(`.+`)
      - traefik.http.routers.http-error-pages.priority=1
      # Set up the HTTPS router for this service.
      - traefik.http.routers.https-error-pages.entrypoints=https
      - traefik.http.routers.https-error-pages.middlewares=gzip
      - traefik.http.routers.https-error-pages.rule=HostRegexp(`.+`)
      - traefik.http.routers.https-error-pages.priority=1
      - traefik.http.routers.https-error-pages.service=https-error-pages
      # Set up the service itself.
      - traefik.http.services.https-error-pages.loadbalancer.server.port=8080
      # Set up the errors middleware, which is used globally for error handling.
      - traefik.http.middlewares.error-pages.errors.status=401-599
      - traefik.http.middlewares.error-pages.errors.service=https-error-pages
      - traefik.http.middlewares.error-pages.errors.query=/{status}.html
    stop_grace_period: 30s
    environment:
      TEMPLATE_NAME: connection

networks:
  iam_private_network:
    name: ${IAM_NETWORK_NAME:-agon_iam_private}
    external: true

volumes:
  agon-iam-helper-cache:
    name: agon-iam-helper-cache
    external: true
```
Hello @nuxwin,
The behavior you are experiencing is expected. I'm not sure if you've read https://github.com/traefik/traefik/issues/6846, but the conversation there is a good read in this regard.
Can you elaborate on what you think is wrong with today's eventual consistency of the dynamic configuration when mixing resources from different providers (other than the error logging)? What could we do instead? You mentioned being fatal, but what would that mean? How could we improve our documentation in this regard?
@rtribotte
Good morning. Thanks for your time.
What I meant is that, if it is expected that the dynamic configuration is not yet resolved at the time the static configuration is applied, and the static configuration references services or middlewares provided through the dynamic configuration, then no error should be raised until the first discovery pass has been done. Basically:
Static configuration resolution stage
Do we have dynamic configuration referenced in the static configuration (@file, @swarm, @docker, ...)?
- Yes: Postpone that configuration and don't raise errors yet for undefined services/middlewares, as dynamic configuration discovery has not been triggered yet.
- No: Continue as normal.
Dynamic configuration discovery
Do we have postponed configuration due to referenced dynamic configuration?
- Yes: Try to resolve it, and raise an error here if some configuration is still undefined.
- No: Continue as normal.
For me, those errors are confusing and inappropriate because, in the end, nothing is actually wrong.
Another approach would be to raise a warning instead of an error in such a condition, and only raise an error when, after discovering the dynamic configuration for each enabled provider, the configuration is still not resolvable.
I think a warning would be the way to go, because in environments like Docker, services can be discovered late, not on the first discovery attempts. I don't really know how the dynamic configuration watchers work exactly, so ...
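The warn-then-error escalation could be sketched as below. This is a hypothetical helper, not Traefik's actual logging API; "providers reported" stands for providers that have delivered their first dynamic configuration:

```go
package main

import "fmt"

// logLevel sketches the "warn first, error later" idea: an unresolved
// cross-provider reference only warrants a warning until every enabled
// provider has delivered its first dynamic configuration; after that it
// becomes a real error. Hypothetical names, not Traefik internals.
func logLevel(refResolved bool, providersReported, providersEnabled int) string {
	if refResolved {
		return "none"
	}
	if providersReported < providersEnabled {
		return "WARN" // discovery still pending: the reference may yet appear
	}
	return "ERROR" // all providers reported and the reference is still missing
}

func main() {
	fmt.Println(logLevel(false, 0, 2)) // WARN: Docker provider not heard from yet
	fmt.Println(logLevel(false, 2, 2)) // ERROR: still unresolved after discovery
	fmt.Println(logLevel(true, 2, 2))  // none
}
```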
Thanks @nuxwin for the thorough feedback.
To be honest, I keep thinking that this log is an expected side effect of the eventual consistency of the applied dynamic configuration. I understand your point and where it stands: from the "static part" of the dynamic configuration (the model on entryPoints), we could infer which providers are required (this is already a concept, and is currently used for the internal one). But it would also mean that, until these required providers produce their configurations, no configuration could be applied at all, even for an entryPoint that has no "model" configuration, which is a potentially huge downside. Also, a required provider could produce a configuration that does not yet contain the required resources, and this would produce the same behavior you are experiencing.
I'm not sure whether this is related to your need here, but it also makes me think of the feature request for a readiness endpoint: https://github.com/traefik/traefik/issues/10458.
Anyway, this is food for thought, we will discuss this with other maintainers during next triage.
We have also noticed these error messages. We create the middlewares as Kubernetes custom resources. These are then attached to Kubernetes Ingress resources using the traefik.ingress.kubernetes.io/router.middlewares annotation (e.g. test-middleware@kubernetescrd).
We still see the error in the log saying that the middleware cannot be found. Nevertheless, the middleware is loaded (later?) and works on the respective Ingress. Unfortunately, these errors in the log are distracting, and "real" errors are easily overlooked.
Yeah, I spent the better part of a day trying to get to the bottom of these errors, thinking they were in fact an indication of a real problem. It pains me to find out they are just noise and can be ignored... I really hope this is cleaned up at some point soon. Thanks.
I can confirm this is still an issue: there is a race between middleware discovery and Ingress discovery where, if the Ingress gets loaded first, you get confusing error messages.
Potentially, when loading an Ingress with middlewares, we could check whether there is a middleware fetch in progress and, if so, wait?
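That wait-on-in-flight-fetch idea could look roughly like this. A minimal sketch with hypothetical names (this is not Traefik's actual Kubernetes provider code): an Ingress load that arrives while a middleware fetch is in flight blocks on a channel until the fetch completes, instead of immediately logging "does not exist":

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// provider sketches a discovery source. fetchDone is non-nil while a
// middleware fetch is in flight; closing it releases waiting loads.
// All names are hypothetical, not Traefik internals.
type provider struct {
	mu          sync.Mutex
	fetchDone   chan struct{}
	middlewares map[string]bool
}

func newProvider() *provider {
	return &provider{middlewares: map[string]bool{}}
}

// startFetch marks a middleware fetch as in progress and returns a
// function that records the fetched names and releases any waiters.
func (p *provider) startFetch() func(names ...string) {
	p.mu.Lock()
	done := make(chan struct{})
	p.fetchDone = done
	p.mu.Unlock()
	return func(names ...string) {
		p.mu.Lock()
		for _, n := range names {
			p.middlewares[n] = true
		}
		p.fetchDone = nil
		p.mu.Unlock()
		close(done)
	}
}

// loadIngress resolves a middleware reference; if a fetch is in flight,
// it waits for it to finish before deciding the reference is missing.
func (p *provider) loadIngress(mw string) string {
	p.mu.Lock()
	done := p.fetchDone
	p.mu.Unlock()
	if done != nil {
		<-done // a middleware fetch is running: wait instead of erroring
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.middlewares[mw] {
		return "ok"
	}
	return "middleware does not exist"
}

func main() {
	p := newProvider()
	finish := p.startFetch()
	go func() {
		time.Sleep(10 * time.Millisecond)
		finish("test-middleware@kubernetescrd")
	}()
	// The Ingress load arrives first but blocks until the fetch completes.
	fmt.Println(p.loadIngress("test-middleware@kubernetescrd"))
}
```

A real implementation would also need a timeout on the wait, otherwise a provider that never answers would stall Ingress loading indefinitely.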