Traefik Router unable to communicate with meshed services when linkerd inbound policy is all-authenticated.
What is the issue?
I installed Linkerd via the Helm chart in an on-prem K3S cluster, in the linkerd namespace. I am using the Traefik ingress controller, which is deployed in the traefik namespace, and I have a few microservices deployed in the default namespace. I configured a Traefik router to reach the microservices from outside the cluster, e.g.:
routers:
  example-service:
    entryPoints:
      - websecure
    rule: "Host(`example.app.com`) && PathPrefix(`/`)"
    tls:
      certResolver: leresolver
    service: example-service
services:
  example-service:
    loadBalancer:
      servers:
        - url: http://example-service.default.svc.cluster.local:8000
When Linkerd is deployed with defaultInboundPolicy: "all-unauthenticated", I can access all the microservices from a browser.
proxy:
  defaultInboundPolicy: "all-unauthenticated"
But when it is deployed with defaultInboundPolicy: "all-authenticated", I can't access the microservices from a browser.
I am new to the Linkerd service mesh and unsure what causes the problem described above.
How can it be reproduced?
- Provision a K3S cluster.
- Install Traefik in the traefik namespace with the following pod annotation:
  deployment:
    podAnnotations:
      linkerd.io/inject: ingress
- Install Linkerd in the linkerd namespace with the values below:
  proxy:
    defaultInboundPolicy: "all-authenticated"
- Deploy an application in the default namespace.
- Annotate the default namespace with linkerd.io/inject=enabled:
  kubectl annotate namespace default linkerd.io/inject=enabled
- Restart the pods in the default namespace so that the Linkerd sidecar is injected.
- Create a router for Traefik in values.yaml:
  routers:
    example-service:
      entryPoints:
        - websecure
      rule: "Host(`example.app.com`) && PathPrefix(`/`)"
      tls:
        certResolver: leresolver
      service: example-service
  services:
    example-service:
      loadBalancer:
        servers:
          - url: http://example-service.default.svc.cluster.local:8000
- In a browser, try to access example.app.com. The application is not reachable.
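The namespace annotation from the steps above can also be expressed declaratively. A minimal sketch, assuming the default namespace is managed through a manifest (the kubectl annotate command in the steps has the same effect):

```yaml
# Declarative equivalent of:
#   kubectl annotate namespace default linkerd.io/inject=enabled
apiVersion: v1
kind: Namespace
metadata:
  name: default
  annotations:
    linkerd.io/inject: enabled
```

Pods that were already running still need a restart before the sidecar is injected.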
Logs, error output, etc
logs from traefik linkerd-proxy container
[ 3083.814154s] INFO ThreadId(01) inbound:server{port=8443}: linkerd_app_inbound::policy::tcp: Connection denied server.group= server.kind=default server.name=all-authenticated tls=Some(Passthru { sni: ServerId(Name("example.app.com")) }) client=10.42.0.1:54877
[ 3083.814178s] INFO ThreadId(01) inbound: linkerd_app_core::serve: Connection closed error=unauthorized connection on default/all-authenticated client.addr=10.42.0.1:54877 server.addr=10.42.0.58:8443
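Reading the two log lines: the denial is recorded by the proxy in the Traefik pod itself (server.addr=10.42.0.58:8443), and the client 10.42.0.1 carries no mesh identity (the TLS is passed through with SNI example.app.com, which looks like the browser's TLS intended for Traefik). Under all-authenticated, such a connection appears to be refused before Traefik ever sees it. One possible workaround, sketched below, is to relax the default inbound policy for the Traefik pod only, using Linkerd's per-workload annotation config.linkerd.io/default-inbound-policy. The values layout is assumed to match the Traefik Helm values in this report, and the sketch has not been tested against this cluster.

```yaml
# Traefik Helm values (sketch): keep the cluster-wide default at
# all-authenticated, but let the ingress pod accept unmeshed clients.
deployment:
  podAnnotations:
    linkerd.io/inject: ingress
    # Assumption: overriding the policy for this workload is acceptable,
    # since browsers outside the mesh cannot present a Linkerd identity.
    config.linkerd.io/default-inbound-policy: all-unauthenticated
```

The meshed services in the default namespace would keep the all-authenticated default, so only Traefik's own listeners are opened to unauthenticated clients.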
output of linkerd check -o short
linkerd-identity
----------------
‼ issuer cert is valid for at least 60 days
issuer certificate will expire on 2024-06-21T09:18:37Z
see https://linkerd.io/2/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints
control-plane-version
---------------------
‼ control plane is up-to-date
unsupported version channel: stable-2.14.10
see https://linkerd.io/2/checks/#l5d-version-control for hints
‼ control plane and cli versions match
control plane running stable-2.14.10 but cli running edge-24.6.2
see https://linkerd.io/2/checks/#l5d-version-control for hints
linkerd-control-plane-proxy
---------------------------
‼ control plane proxies are up-to-date
some proxies are not running the current version:
* linkerd-identity-6dbb555cf7-q9c8g (stable-2.14.10)
* metrics-api-b85485b99-2cbkt (stable-2.14.10)
* web-58979b9448-72znj (stable-2.14.10)
* tap-injector-c48598d4c-chc25 (stable-2.14.10)
* tap-7999d688ff-kgzqh (stable-2.14.10)
* linkerd-proxy-injector-7f6964c9b9-fx8vx (stable-2.14.10)
* linkerd-destination-5dc7694bc5-t4glt (stable-2.14.10)
see https://linkerd.io/2/checks/#l5d-cp-proxy-version for hints
‼ control plane proxies and cli versions match
linkerd-identity-6dbb555cf7-q9c8g running stable-2.14.10 but cli running edge-24.6.2
see https://linkerd.io/2/checks/#l5d-cp-proxy-cli-version for hints
linkerd-viz
-----------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* linkerd-identity-6dbb555cf7-q9c8g (stable-2.14.10)
* metrics-api-b85485b99-2cbkt (stable-2.14.10)
* web-58979b9448-72znj (stable-2.14.10)
* tap-injector-c48598d4c-chc25 (stable-2.14.10)
* tap-7999d688ff-kgzqh (stable-2.14.10)
* linkerd-proxy-injector-7f6964c9b9-fx8vx (stable-2.14.10)
* linkerd-destination-5dc7694bc5-t4glt (stable-2.14.10)
see https://linkerd.io/2/checks/#l5d-viz-proxy-cp-version for hints
‼ viz extension proxies and cli versions match
linkerd-identity-6dbb555cf7-q9c8g running stable-2.14.10 but cli running edge-24.6.2
see https://linkerd.io/2/checks/#l5d-viz-proxy-cli-version for hints
Status check results are √
Environment
$ linkerd version
Client version: edge-24.6.2
Server version: stable-2.14.10
$ helm version
version.BuildInfo{Version:"v3.11.3", GitCommit:"323249351482b3bbfc9f5004f65d400aa70f9ae7", GitTreeState:"clean", GoVersion:"go1.20.3"}
$ kubectl version --short
Client Version: v1.27.1
Kustomize Version: v5.0.1
Server Version: v1.25.6+k3s1
Cluster type: Single node on-prem K3S
Ingress Controller: Traefik v2.9.8
You'll need some extra config to get Traefik to play nice with Linkerd. Please check the detailed instructions in the docs
I tried adding the middleware, but the problem persists.
Below is the relevant snippet of the Traefik ConfigMap:
http:
  middlewares:
    l5d-header:
      headers:
        customRequestHeaders:
          l5d-dst-override: "example-service.default.svc.cluster.local:8000"
  routers:
    example-service:
      entryPoints:
        - websecure
      rule: "Host(`example.app.com`) && PathPrefix(`/`)"
      middleware:
        - l5d-header
      tls:
        certResolver: leresolver
      service: example-service
  services:
    example-service:
      loadBalancer:
        servers:
          - url: http://example-service.default.svc.cluster.local:8000
Note: I am using letsencrypt certResolver in traefik for TLS.
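One detail in the snippet above may be worth double-checking: in Traefik's dynamic configuration, the router option that attaches middlewares is spelled middlewares (plural), so a middleware key may leave l5d-header unattached. A sketch of the same router with the plural key, otherwise unchanged from the snippet:

```yaml
http:
  middlewares:
    l5d-header:
      headers:
        customRequestHeaders:
          l5d-dst-override: "example-service.default.svc.cluster.local:8000"
  routers:
    example-service:
      entryPoints:
        - websecure
      rule: "Host(`example.app.com`) && PathPrefix(`/`)"
      # Plural key: the list of middleware names applied to this router.
      middlewares:
        - l5d-header
      tls:
        certResolver: leresolver
      service: example-service
```

This only concerns whether the l5d-dst-override header is set on proxied requests.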
@palashbasik Have you meshed Traefik using linkerd.io/inject: ingress? It's not hard to miss that bit in our docs for Traefik v2... 😐
@palashbasik I see that you listed that you're using ingress mode above, but it's worth double-checking. 🙂 Also: instead of the Traefik ConfigMap, can we see the YAML you're configuring Traefik with?
Below is the override-values.yaml file for Traefik.
deployment:
  replicas: 3
  podAnnotations:
    linkerd.io/inject: ingress
# Pod disruption budget
podDisruptionBudget:
  enabled: true
  # maxUnavailable: 1
  # maxUnavailable: 33%
  minAvailable: 1
  # minAvailable: 25%
# Enable experimental features
experimental:
  v3:
    enabled: true
  plugins:
    enabled: true
# Create an IngressRoute for the dashboard
ingressRoute:
  dashboard:
    enabled: true
## Logs
## https://docs.traefik.io/observability/logs/
logs:
  ## Traefik logs concern everything that happens to Traefik itself (startup, configuration, events, shutdown, and so on).
  general:
    # By default, the logs use a text format (common), but you can also ask for the json format in the format option
    # format: json
    # By default, the level is set to ERROR.
    # Alternative logging levels are DEBUG, PANIC, FATAL, ERROR, WARN, and INFO.
    level: INFO
  access:
    # To enable access logs
    enabled: true
    ## By default, logs are written using the Common Log Format (CLF) on stdout.
    ## To write logs in JSON, use json in the format option.
    format: json
    # filePath: "/var/log/traefik/access.log
    ## To write the logs in an asynchronous fashion, specify a bufferingSize option.
    ## This option represents the number of log lines Traefik will keep in memory before writing
    ## them to the selected output. In some cases, this option can greatly help performances.
    # bufferingSize: 100
    ## Filtering https://docs.traefik.io/observability/access-logs/#filtering
    filters: {}
      # statuscodes: "200,300-302"
      # retryattempts: true
      # minduration: 10ms
    ## Fields
    ## https://docs.traefik.io/observability/access-logs/#limiting-the-fieldsincluding-headers
    fields:
      general:
        defaultmode: keep
        names:
          StartUTC: drop
          StartLocal: drop
          RouterName: drop
          ServiceAddr: drop
          ClientPort: drop
          ClientUsername: drop
          RequestHost: drop
          RequestPort: drop
          RequestMethod: drop
          RequestPath: drop
          RequestProtocol: drop
          RequestScheme: drop
          RequestContentSize: drop
          OriginDuration: drop
          OriginContentSize: drop
          OriginStatus: drop
          OriginStatusLine: drop
          DownstreamStatusLine: drop
          RequestCount: drop
          GzipRatio: drop
          Overhead: drop
          TLSVersion: drop
          TLSCipher: drop
metrics:
  ## Prometheus is enabled by default.
  ## It can be disabled by setting "prometheus: null"
  prometheus:
    ## Entry point used to expose metrics.
    entryPoint: metrics
    addEntryPointsLabels: true
    addRoutersLabels: true
    addServicesLabels: true
    ## Buckets for latency metrics. Default="0.1,0.3,1.2,5.0"
    # buckets: "0.5,1.0,2.5"
    ## When manualRouting is true, it disables the default internal router in
    ## order to allow creating a custom router for prometheus@internal service.
    # manualRouting: true
tracing:
  jaeger:
    collector:
      endpoint: http://jaeger-collector.monitoring.svc.cluster.local:14268/api/traces
secret:
  enabled: true
# Environment variables to be passed to Traefik's binary
env:
  - name: CLOUDFLARE_EMAIL
    value: <your-email-id>
  - name: CLOUDFLARE_API_KEY
    valueFrom:
      secretKeyRef:
        name: traefik-secret
        key: CLOUDFLARE_API_KEY
# Configure ports
ports:
  web:
    expose: false
  websecure:
    # Enable this entrypoint as a default entrypoint. When a service doesn't explicitly set an entrypoint it will only use this entrypoint.
    # asDefault: true
    tls:
      enabled: true
      # this is the name of a TLSOption definition
      # options: ""
      certResolver: "leresolver"
      # domains: []
## Create HorizontalPodAutoscaler object.
##
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
# Enable persistence using Persistent Volume Claims
# ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
# It can be used to store TLS certificates, see `storage` in certResolvers
persistence:
  enabled: true
certResolvers:
  leresolver:
    # for challenge options cf. https://doc.traefik.io/traefik/https/acme/
    email: <your-email-id>
    dnsChallenge:
      # also add the provider's required configuration under env
      # or expand them from secrets/configmaps with envfrom
      # cf. https://doc.traefik.io/traefik/https/acme/#providers
      provider: cloudflare
      # add further options for the dns challenge as needed
      # cf. https://doc.traefik.io/traefik/https/acme/#dnschallenge
      delayBeforeCheck: 30
      resolvers:
        - 1.1.1.1
        - 8.8.8.8
    tlsChallenge: false
    # httpChallenge:
    #   entryPoint: "web"
    # It has to match the path with a persistent volume
    storage: /data/acme.json
additionalArguments:
  - "--providers.file.filename=/config/config.yaml"
volumes:
  - name: '{{ printf "%s-configs" .Release.Name }}'
    mountPath: '/config'
    type: configMap
resources:
  requests:
    cpu: "100m"
    memory: "1Gi"
  limits:
    cpu: "500m"
    memory: "2Gi"
config: |-
  http:
    middlewares:
      corsHeader:
        headers:
          accessControlAllowCredentials: true
          accessControlAllowHeaders:
            - Accept
            - Access-Control-Request-Headers
            - Access-Control-Request-Method
            - Authorization
            - Content-Type
            - Last-Modified
            - Origin
            - X-Requested-With
            - Sec-WebSocket-Key
          accessControlAllowMethods: "*"
          accessControlAllowOriginList:
            - http://localhost:3000
          accessControlMaxAge: 100
          addVaryHeader: true
      basic-admin-auth:
        basicAuth:
          users:
            # password - password - hashed with bcrypt
            - "admin:$2a$12$fpgiRwj7e2XBv/U4LWDvr.Jr7sRPECklDxitBdXDkBzLS6r4TU5Pm"
      strip-service-prefix:
        # Modifies "/team/hello" to "/hello"
        replacePathRegex:
          regex: '^/$1/$1/(.*)'
          #regex: '^/.*?/(.*)'
          replacement: '/$1'
    routers:
      example-service:
        entryPoints:
          - websecure
        # Should prevent any route containing the word "internal" to be blocked
        rule: "Host(`example.app.com`) && PathPrefix(`/`)"
        middlewares:
          - strip-service-prefix
        tls:
          certResolver: leresolver
        service: example-service
    services:
      # Define how to reach an existing service on our infrastructure
      example-service:
        loadBalancer:
          servers:
            - url: http://example-service.default.svc.cluster.local:8000
With the provided Traefik configuration and Linkerd deployed with defaultInboundPolicy set to "all-authenticated", I can't access https://example.app.com from a browser.
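For anyone who wants to keep all-authenticated as the cluster default, a narrower option than changing the global policy might be an explicit Linkerd policy for Traefik's TLS port. The following is a sketch only: the resource names, the app.kubernetes.io/name: traefik pod label, and the choice of proxyProtocol are assumptions, port 8443 is taken from the proxy log earlier in this issue, and none of it has been tested on this cluster.

```yaml
# Describe Traefik's websecure listener as a Linkerd Server.
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: traefik
  name: traefik-websecure          # hypothetical name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: traefik   # assumed label; check the actual pod labels
  port: 8443
  proxyProtocol: TLS               # traffic is browser TLS terminated by Traefik
---
# Allow clients from any network, meshed or not, to reach that Server.
apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  namespace: traefik
  name: all-networks               # hypothetical name
spec:
  networks:
    - cidr: 0.0.0.0/0
    - cidr: ::/0
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  namespace: traefik
  name: traefik-websecure-allow-all   # hypothetical name
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: traefik-websecure
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: NetworkAuthentication
      name: all-networks
```

Only the single ingress port is opened this way; other ports on the Traefik pod and all meshed services would still fall under the all-authenticated default.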
Note: The host example.app.com mentioned above is solely for illustrative purposes.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
@palashbasik Have you meshed Traefik using linkerd.io/inject: ingress? It's not hard to miss that bit in our docs for Traefik v2... 😐
https://linkerd.io/2.16/tasks/using-ingress/#traefik-normal-mode says not to use that annotation with Traefik v2+.
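If that page applies here, the Traefik values would use normal-mode injection instead of ingress mode. A sketch, assuming the same Helm values layout as the override-values.yaml posted earlier in this thread; whether this alone changes the all-authenticated behaviour has not been verified here:

```yaml
deployment:
  podAnnotations:
    # Normal mode: the proxy is injected without ingress-mode routing.
    linkerd.io/inject: enabled
```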