
Linkerd exposes all ports on a pod regardless of Service definition

Open FredrikAugust opened this issue 8 months ago • 6 comments

What is the issue?

Hey, contrary to default Kubernetes behavior, Linkerd allows you to reach any port on a meshed pod from another meshed pod, regardless of whether that port is exposed in the attached Service. This can lead to bugs, as you would normally not expect to be able to reach a pod on a port that the Service does not explicitly expose.

How can it be reproduced?

I've created a very simple reproduction here: https://github.com/FredrikAugust/linkerd-implicit-svc-port-exposing-repro

Logs, error output, etc

See the attached repository. The interesting part is simply that you can reach the pod's container port even though the Service does not expose it, so I'm not sure which logs would be relevant to attach.

output of linkerd check -o short

This is from a fresh Linkerd install, so everything reports OK.

Environment

We first experienced this in our GKE cluster, but the repro uses a k3d (k3s) cluster, so I don't think the environment is relevant here.

Possible solution

Linkerd should respect the attached Service and only allow connections on the ports it exposes.

Additional context

I found this post which appears to be related: https://linkerd.buoyant.io/t/proxy-accepting-telnet-on-any-port/459.

Would you like to work on fixing this bug?

None

FredrikAugust avatar Apr 10 '25 09:04 FredrikAugust

I understand that this behavior is surprising, but it is inherent to the way iptables-based rerouting works. You're not quite correct that Kubernetes requires that all ports be enumerated on a Service! Ports are in fact optional (because Services are typically discovered via A record lookups that return an IP address), so the client is able to connect to any port on which the server is listening.

When the server is meshed in Linkerd, however, the sidecar proxy must accept all inbound TCP connections to even know which port the connection targets!

It's possible you could alter this behavior by modifying your proxy-init configuration to skip the proxy on ports on which you do not wish to serve inbound traffic. This would not be straightforward to automate in Linkerd, however, as Pod-level port declarations (containerPort) are themselves optional.
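A rough sketch of what I mean, assuming the standard config.linkerd.io/skip-inbound-ports annotation is the proxy-init knob in question (the port numbers below are placeholders): connections to the listed ports bypass the proxy's iptables redirect entirely, so whether they succeed depends only on whether the application itself is listening there.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        linkerd.io/inject: enabled
        # placeholder ports: inbound connections to 8080 and 9090 skip the
        # proxy and are refused unless the application listens on them
        config.linkerd.io/skip-inbound-ports: "8080,9090"
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80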

olix0r avatar Apr 15 '25 15:04 olix0r

I see @olix0r. That makes sense. Would it be possible for Linkerd to check which ports are enumerated in the Svc and "manually" block the rest, or does that go against the design philosophy?

FredrikAugust avatar Apr 22 '25 08:04 FredrikAugust

Also, if this is meant to stay this way, perhaps you could add some warnings to the documentation? This was quite hard to debug, as there are no real docs on it.

FredrikAugust avatar Apr 22 '25 08:04 FredrikAugust

@olix0r

You're not quite correct that Kubernetes requires that all ports be enumerated on a Service! Ports are in fact optional (because Services are typically discovered via A record lookups that return an IP address), so the client is able to connect to any port on which the server is listening.

This is what I was told, but I'm not able to reproduce that behavior when testing locally.

I created a k3d cluster with the following nginx definition:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      annotations:
        # mesh disabled for this initial, un-meshed baseline test
        linkerd.io/inject: disabled
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 1234
      # targetPort 1234 is a port nginx is not listening on
      targetPort: 1234

I'm not able to get a response when running curl nginx.default.svc.cluster.local:80; it times out after 75 seconds. curl-ing port 1234, however, fails immediately with connection refused.

Log output from curl on the Service port (1234), which maps to a target port nothing is listening on:

tmp-shell:~# curl nginx.default.svc.cluster.local:1234 -v
* Host nginx.default.svc.cluster.local:1234 was resolved.
* IPv6: (none)
* IPv4: 10.43.237.84
*   Trying 10.43.237.84:1234...
* connect to 10.43.237.84 port 1234 from 10.42.0.21 port 57048 failed: Connection refused
* Failed to connect to nginx.default.svc.cluster.local port 1234 after 3 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to nginx.default.svc.cluster.local port 1234 after 3 ms: Couldn't connect to server

Log output from curl on the port the server is listening on (80), which is not included in the Service:

tmp-shell:~# curl nginx.default.svc.cluster.local:80 -v
* Host nginx.default.svc.cluster.local:80 was resolved.
* IPv6: (none)
* IPv4: 10.43.237.84
*   Trying 10.43.237.84:80...
* connect to 10.43.237.84 port 80 from 10.42.0.21 port 51548 failed: Connection refused
* Failed to connect to nginx.default.svc.cluster.local port 80 after 75007 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to nginx.default.svc.cluster.local port 80 after 75007 ms: Couldn't connect to server

Changing the Service so that port 1234 points to port 80 on the container of course works as expected.
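For reference, that change is just the Service's port mapping (the curl output below is against this version):

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 1234
      # now forwards to the port nginx actually listens on
      targetPort: 80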

tmp-shell:~# curl nginx.default.svc.cluster.local:1234
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
...

Now, if I simply mesh both pods, things change. At this point I've reverted the Service to map 1234 to 1234 (a port the server is not listening on).
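(Meshing here just means flipping the inject annotation on the nginx pod template and on the client pod:)

      annotations:
        linkerd.io/inject: enabled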

(screenshot attached)

I also thought you might mean that you can define a Service without any ports to "expose all", but that doesn't seem to work either.
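(For reference, a portless Service would look something like the following; as far as I can tell, Kubernetes only accepts omitting ports for headless Services, i.e. clusterIP: None.)

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  # headless: no cluster IP, and no ports declared
  clusterIP: None
  selector:
    app: nginx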

(screenshot attached)

FredrikAugust avatar Apr 24 '25 08:04 FredrikAugust

Hi, @olix0r. Did you have time to look at my response?

FredrikAugust avatar May 26 '25 09:05 FredrikAugust

Can confirm I've experienced the same on the latest BEL running in AWS EKS.

steve-gray avatar Jun 05 '25 19:06 steve-gray