
authentication policy for traffic from a pod's own ClusterIp

Open · hawkw opened this issue 2 years ago · 1 comment

What problem are you trying to solve?

In some cases, it is desirable to allow traffic that originates from a pod to reach another container in that same pod, without allowing traffic from outside of that pod. This cannot currently be expressed with the NetworkAuthentication resource if the authentication is created before the pod, since the pod's IP is not known at that time.

In addition, creating the NetworkAuthentication after the pod is created is not always a workable solution. It requires manual intervention (creating the NetworkAuthentication resource with that pod's specific IP). And if we want a policy that allows traffic from a pod to itself to apply to more than one specific pod (e.g. multiple replicas in a deployment, or pods that are created and destroyed by a rollout or eviction), hard-coding the IP of a specific pod is not sufficient.
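For illustration, the manual workaround described above would look roughly like the following, for a hypothetical pod whose current IP happens to be 10.42.0.6 (the resource has to be recreated by hand whenever that IP changes):

apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  name: pod-own-ip # hypothetical name
spec:
  networks:
  - cidr: 10.42.0.6/32 # this specific pod's current IP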

The motivating example is linkerd-viz's Prometheus pod(s). The Prometheus container must be able to scrape its own proxy to collect metrics about traffic originating from the Prometheus pod itself. If we want to create an AuthorizationPolicy that only permits Prometheus to scrape proxy metrics from proxies in the linkerd-viz namespace, and we use a MeshTLSAuthentication for the prometheus ServiceAccount, that will not authenticate scrapes from Prometheus to its own proxy, since this traffic is not mTLS'd. Unfortunately, we cannot use a NetworkAuthentication here either: the client address the proxy observes for these connections is the pod's own ClusterIP, not loopback. Because the ClusterIP is not known until the pod is created, and may change if the pod is evicted and rescheduled, the only valid NetworkAuthentication that would authenticate this traffic is one that allows all networks to access the proxy's metrics endpoint. This is unfortunate, as the Prometheus proxy's metrics can be used to enumerate metadata on every pod in the cluster, which we might reasonably want to restrict.
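For concreteness, the only NetworkAuthentication that reliably covers this traffic today is an allow-everything one, along the lines of this sketch (the name is made up):

apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  name: allow-any-network # hypothetical name
  namespace: linkerd-viz
spec:
  networks:
  - cidr: 0.0.0.0/0
  - cidr: ::/0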

As a worked example of this scenario, if we create the following policy resources in the linkerd-viz namespace:

---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: proxy-admin
  namespace: linkerd-viz
spec:
  podSelector:
    matchExpressions:
    - key: linkerd.io/control-plane-ns
      operator: Exists
  port: linkerd-admin
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPRoute
metadata:
  name: proxy-metrics
  namespace: linkerd-viz
spec:
  parentRefs:
    - name: proxy-admin
      kind: Server
      group: policy.linkerd.io
  rules:
    - matches:
      - path:
          value: "/metrics"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: prometheus-scrape
  namespace: linkerd-viz
spec:
  targetRef:
    group: policy.linkerd.io
    kind: HTTPRoute
    name: proxy-metrics
  requiredAuthenticationRefs:
    - kind: ServiceAccount
      name: prometheus
      namespace: linkerd-viz

and then run commands like linkerd viz stat or linkerd viz edges, we will see that metrics exist for other pods in the linkerd-viz namespace, but not for traffic originating from the Prometheus pod.

If we inspect the logs from the Prometheus pod's proxy, we see a number of messages like this:

[   784.507178s]  INFO ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:rescue{client.addr=10.42.0.6:52344}: linkerd_app_core::errors::respond: Request failed error=unauthorized request on route
[   794.507521s]  INFO ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}: linkerd_app_inbound::policy::http: Request denied server.group=policy.linkerd.io server.kind=server server.name=proxy-admin route.group=policy.linkerd.io route.kind=HTTPRoute route.name=proxy-metrics client.tls=None(NoClientHello) client.ip=10.42.0.6

indicating that the proxy is denying Prometheus's requests to its own proxy because they are not mutually TLS'd (Prometheus itself cannot participate in mesh TLS; only its proxy can originate mesh TLS on its behalf), and therefore they don't satisfy the required authentication for the prometheus ServiceAccount.

How should the problem be solved?

We should add an authentication policy that matches traffic originating from a pod's own ClusterIP. In my opinion, this probably makes the most sense as a new variant of the NetworkAuthentication resource, but it could also be a new resource type.

I believe a feature like this could be implemented entirely in the policy controller, without requiring changes to the proxy or proxy-api. The policy controller knows the ClusterIP of the pod that a proxy queries it from, so when it sees an authentication policy for the pod's own ClusterIP, it can send that proxy an Authz message whose networks field includes a network that matches that IP exactly (plus any other networks that would apply).
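As an illustrative sketch (addresses are hypothetical), the proposed network value (called podClusterIp below) would be resolved per pod before the authorization is served to that pod's proxy, so each replica would receive an exact-match network for its own address:

# Resolved for a pod whose ClusterIP is 10.42.0.6:
networks:
- cidr: 10.42.0.6/32
# Resolved for a replica whose ClusterIP is 10.42.0.7:
networks:
- cidr: 10.42.0.7/32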

Any alternatives you've considered?

  • An alternative solution to this problem would be to create some kind of operator that watches a set of pods, and creates/updates NetworkAuthentication resources that match those specific pods' own IPs. However, this is significantly more work to implement, and, while it could solve the problem for linkerd-viz specifically, it wouldn't provide a generally applicable solution: Linkerd users who want to create similar policies would need to implement their own similar operators, which seems like a shame.

  • Another alternative is to decide that proxies should always allow traffic from their own pod IPs. In that case, we could implement this in the proxy (which also knows its own cluster IP via an env var), which might be a bit simpler than adding it to the policy controller. But this would mean the behavior is always on and cannot be configured...

  • As I mentioned, I think it makes the most sense to implement something like this as a new variant of the NetworkAuthentication resource. But we could, alternatively, create a whole new authentication policy resource type for it. This seems more complicated and I don't really know if it would make sense; logically, this would also be a form of network-based authentication.

How would users interact with this feature?

We would add a new type of network value to the NetworkAuthentication resource to represent an authentication that matches a pod's own ClusterIP. Then, users could create NetworkAuthentications like:

apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  name: local-ip
spec:
  networks:
  - podClusterIp # naming subject to bikeshedding, of course.

This could also be combined with existing network values, as in:

apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  name: complicated-authn
spec:
  networks:
  - podClusterIp
  - cidr: 100.64.0.0/10
  - cidr: 172.16.0.0/12
  - cidr: 192.168.0.0/16
    except:
    - 192.168.0.17/32
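To tie this back to the motivating example: since (as I understand it) separate AuthorizationPolicy resources targeting the same route each grant access independently, a second policy alongside prometheus-scrape could reference such a NetworkAuthentication to admit a pod's own unmeshed scrapes of the proxy-metrics route. A sketch, reusing the local-ip resource above (the policy name is illustrative):

apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: prometheus-self-scrape # illustrative name
  namespace: linkerd-viz
spec:
  targetRef:
    group: policy.linkerd.io
    kind: HTTPRoute
    name: proxy-metrics
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: NetworkAuthentication
      name: local-ip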

Would you like to work on this feature?

yes

hawkw · Sep 01 '22 19:09

@olix0r pointed out that we may alternatively just want to always allow traffic from a pod's own IP, in which case this could be implemented pretty simply in the proxy.[^1] I've added that to the list of alternative solutions.

[^1]: which knows its own IP via an env variable.

hawkw · Sep 02 '22 16:09

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] · Dec 02 '22 06:12