kiam

Question: drop existing TCP connections to Instance Metadata API

Open moolen opened this issue 5 years ago • 1 comments

This is probably related to #48. It happened to me after a cluster rollover.

My scenario: I start a pod. The application inside uses the AWS SDK, which tries to fetch credentials from the Instance Metadata API.

If kiam is installed, the iptables DNAT does its magic and the app gets the credentials. Everything works and we're happy :tada: If kiam is not installed, it fails, for obvious reasons. So far so good.

But what happens if kiam is not yet installed (e.g. after a node reboot) and a pod that wants to use kiam starts before the agent DaemonSet is running? The pod opens a connection to the Instance Metadata Service and tries to fetch credentials, which of course fails. Once kiam has started and the iptables DNAT rule is in place, the TCP connection between the pod and the Metadata API is already ESTABLISHED. NAT rules are only evaluated for the first packet of a connection, so the DNAT does not affect existing connections, and we have to either kill all connections to the Metadata API or restart the pod.

This, of course, depends on the application. If it tries to assume a role or fetch credentials every 30s while its TCP connection timeout is 60s, the connection will likely never be terminated (this happens to me with cloudwatch-exporter and external-dns).
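
A minimal Go sketch of why such a connection can stay ESTABLISHED, assuming the application's HTTP client behaves like Go's net/http default transport: sequential requests reuse one kept-alive TCP connection, while a transport with keep-alives disabled dials a new connection per request. The helpers `newClient` and `countConnections` are made up for this illustration, and the local httptest server only stands in for the metadata endpoint.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"os"
	"sync/atomic"
)

// newClient returns an HTTP client with keep-alives enabled or disabled.
func newClient(keepAlive bool) *http.Client {
	return &http.Client{Transport: &http.Transport{DisableKeepAlives: !keepAlive}}
}

// countConnections sends n sequential GET requests to a throwaway local
// server and returns how many TCP connections the server accepted.
func countConnections(client *http.Client, n int) (int64, error) {
	var opened int64
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	// ConnState is called with StateNew once per accepted TCP connection.
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&opened, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		// Draining and closing the body lets the transport reuse the connection.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return atomic.LoadInt64(&opened), nil
}

func main() {
	for _, keepAlive := range []bool{true, false} {
		n, err := countConnections(newClient(keepAlive), 5)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("keep-alive=%v: %d connection(s) for 5 requests\n", keepAlive, n)
	}
}
```

With keep-alives on, all five requests travel over a single connection, which is the situation in which a DNAT rule added later is never consulted. Whether a given application's SDK client can be configured to drop keep-alives is not covered here.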

Possible workarounds:

  • init container that does some checkups
  • killall (metadata API) TCP connections :trollface:
  • restart pods (is that even an option?)
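
The first workaround could look roughly like the following Go sketch: an init container binary that polls a URL until it answers with HTTP 200, then exits so the main container can start. This is an assumption-laden illustration, not part of kiam: `waitUntil` and `httpProbe` are hypothetical helpers, the URL is passed as an argument, and which endpoint actually proves that the kiam agent (rather than the real metadata service) is answering is left open.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
	"time"
)

// waitUntil calls probe up to attempts times, sleeping interval between
// tries, and returns nil as soon as probe reports success.
func waitUntil(attempts int, interval time.Duration, probe func() bool) error {
	for i := 0; i < attempts; i++ {
		if probe() {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(interval)
		}
	}
	return errors.New("endpoint not ready after all attempts")
}

// httpProbe returns a probe that succeeds when url answers with HTTP 200.
// Keep-alives are disabled so every attempt opens a new TCP connection,
// which is matched against whatever iptables rules exist at that moment.
func httpProbe(url string) func() bool {
	client := &http.Client{
		Timeout:   2 * time.Second,
		Transport: &http.Transport{DisableKeepAlives: true},
	}
	return func() bool {
		resp, err := client.Get(url)
		if err != nil {
			return false
		}
		resp.Body.Close()
		return resp.StatusCode == http.StatusOK
	}
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: wait-for-endpoint <url>")
		os.Exit(2)
	}
	// 60 attempts, one second apart: an arbitrary budget for this sketch.
	if err := waitUntil(60, time.Second, httpProbe(os.Args[1])); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because the init container finishes before the application container starts, the application's first connection to the Metadata API would be opened only after the check has passed.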

Question:

How can we handle this case? Does anyone experience the same problem?

How to reproduce it locally (without AWS)

Create a test application. It is available as docker.io/moolen/aws-whoami:

// Taken from the AWS SDK examples.
package main

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sts"
)

func main() {
	// session.New is deprecated; NewSession reports configuration errors.
	sess := session.Must(session.NewSession())
	svc := sts.New(sess)

	// Ask STS for the caller identity every 5 seconds. Each call needs
	// credentials, so failures to fetch them show up here as errors.
	for {
		result, err := svc.GetCallerIdentity(&sts.GetCallerIdentityInput{})
		if err != nil {
			if aerr, ok := err.(awserr.Error); ok {
				// awserr.Error also exposes Code() and Message().
				fmt.Println(aerr.Error())
			} else {
				fmt.Println(err.Error())
			}
		} else {
			fmt.Println(result)
		}
		time.Sleep(5 * time.Second)
	}
}

Dockerfile

FROM alpine:3.10
RUN apk add --no-cache curl wget musl-dev bind-tools
COPY ./tester /usr/bin/tester

ENTRYPOINT [ "/usr/bin/tester" ]

Then do roughly the following on minikube:

  • run the tester image from above
  • SSH into minikube and use toolbox to install curl, iptables, conntrack, iproute
  • add the metadata IP: ip addr add 169.254.169.254 dev eth0
  • make sure something is listening on :80: curl -i http://localhost:80 (for me it was the docker proxy)

If the tester pod is running, you should see ESTABLISHED connections:

$ conntrack -L conntrack | grep 169.
tcp      6 86399 ESTABLISHED src=172.17.0.6 dst=169.254.169.254 sport=33114 dport=80 src=169.254.169.254 dst=172.17.0.6 sport=80 dport=33114 [ASSURED] mark=0 use=1

There should be no iptables rule yet:

$ iptables-save | grep 169
  > empty

Now either install the kiam-agent (which creates this rule) or add it manually using iptables.

$ iptables-save | grep 169.254
-A PREROUTING -d 169.254.169.254/32 -i cali+ -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.122.37:8181

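Now check the conntrack table again. The established entry is still there, and its reply tuple still points at 169.254.169.254, i.e. it has not been DNATed: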

$ conntrack -L conntrack | grep 169.
tcp      6 86399 ESTABLISHED src=172.17.0.6 dst=169.254.169.254 sport=33114 dport=80 src=169.254.169.254 dst=172.17.0.6 sport=80 dport=33114 [ASSURED] mark=0 use=1
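
To make reading that output less error-prone, here is a small Go sketch that classifies a line of `conntrack -L` output. It relies on the layout shown above: the first `dst=` belongs to the original tuple and the second `src=` to the reply tuple, so a flow that bypasses the DNAT has both equal to the metadata IP. The function name `bypassesDNAT` is made up, and the second sample line in `main` is a constructed example of what a translated flow might look like, not captured output.

```go
package main

import (
	"fmt"
	"strings"
)

// bypassesDNAT reports whether a line of `conntrack -L` output describes a
// flow to dst that was not rewritten: the reply tuple's src still equals the
// original tuple's dst.
func bypassesDNAT(line, dst string) bool {
	var origDst, replySrc string
	seenDst, seenSrc := 0, 0
	for _, field := range strings.Fields(line) {
		switch {
		case strings.HasPrefix(field, "dst="):
			seenDst++
			if seenDst == 1 { // first dst= is the original direction
				origDst = strings.TrimPrefix(field, "dst=")
			}
		case strings.HasPrefix(field, "src="):
			seenSrc++
			if seenSrc == 2 { // second src= is the reply direction
				replySrc = strings.TrimPrefix(field, "src=")
			}
		}
	}
	return origDst == dst && replySrc == dst
}

func main() {
	const metadataIP = "169.254.169.254"

	// Line taken from the conntrack output above.
	stale := "tcp      6 86399 ESTABLISHED src=172.17.0.6 dst=169.254.169.254 sport=33114 dport=80 src=169.254.169.254 dst=172.17.0.6 sport=80 dport=33114 [ASSURED] mark=0 use=1"

	// Constructed example (assumption): the same flow after DNAT to the agent.
	translated := "tcp      6 86399 ESTABLISHED src=172.17.0.6 dst=169.254.169.254 sport=33114 dport=80 src=192.168.122.37 dst=172.17.0.6 sport=8181 dport=33114 [ASSURED] mark=0 use=1"

	fmt.Println("stale flow bypasses DNAT:", bypassesDNAT(stale, metadataIP))
	fmt.Println("translated flow bypasses DNAT:", bypassesDNAT(translated, metadataIP))
}
```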

Solution (maybe?)

This works for my use-case:

initContainers:
  - name: conntrack
    image: some-conntrack-image:latest
    securityContext:
      capabilities:
        add: ["NET_ADMIN"]
    command: ['conntrack', '-D', 'conntrack', '-d', '169.254.169.254']

We could incorporate this behavior into the upstream image and enable it with a feature flag --drop-conn. Would you like to see this upstream?
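
As a rough idea of what that could look like, here is a Go sketch of a binary with a `--drop-conn` flag that shells out to the same conntrack command as the init container above. Everything here is an assumption for discussion: it uses the standard library flag package rather than whatever kiam actually uses for its CLI, `conntrackArgs` is a hypothetical helper, and it expects the conntrack binary and the NET_ADMIN capability to be available.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"os/exec"
)

const metadataIP = "169.254.169.254"

// conntrackArgs builds the arguments for deleting every tracked flow whose
// original destination is dst, mirroring the init container command above.
func conntrackArgs(dst string) []string {
	return []string{"-D", "conntrack", "-d", dst}
}

func main() {
	// Hypothetical flag; off by default so existing behavior is unchanged.
	dropConn := flag.Bool("drop-conn", false,
		"delete existing conntrack entries for the metadata API")
	flag.Parse()

	if !*dropConn {
		return
	}

	// Intended to run after the DNAT rule is installed, so that
	// reconnecting clients are redirected.
	out, err := exec.Command("conntrack", conntrackArgs(metadataIP)...).CombinedOutput()
	fmt.Print(string(out))
	if err != nil {
		fmt.Fprintln(os.Stderr, "conntrack:", err)
		os.Exit(1)
	}
}
```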

moolen, Aug 22 '19 09:08

Thanks so much @moolen for all that, having a think on it!

One other thing we've talked about doing is making a stronger suggestion for operators to install the iptables rules as part of the system initialisation outside of Kiam. I just spoke to @Joseph-Irving and he also said that the team here use https://github.com/uswitch/nidhogg to accomplish the same.

pingles, Sep 04 '19 12:09