
Pods w/ hostNetwork set to true don't work

Open sporkmonger opened this issue 6 years ago • 6 comments

Took a good while to track this one down, but it seems like setting hostNetwork to true causes the pod to be assigned the IP of the node itself, which presumably prevents kiam from doing what it needs to do. I'm not sure this is something you can fix, but if not, seems like it's worthy of being documented, ideally w/ options for work-around.

sporkmonger avatar Jul 13 '18 16:07 sporkmonger

Good point. At the moment the iptables rules are inserted by matching packets according to their egress network interface (this is a flag on the agent).

Agreed, it would be good to document.
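For anyone following along, here is a rough sketch of the kind of interface-matched rule being described. It is an illustration under assumptions, not a copy of what the agent installs: the chain (nat PREROUTING), the interface pattern `cali+`, the node IP 10.0.0.10 and the agent port 8181 are all example values. The function only prints the command, so it is safe to run without root.

```shell
#!/bin/sh
# Print (do not apply) a DNAT rule that redirects metadata API traffic
# arriving on a given interface to an agent listening on the node.
# All three arguments are hypothetical example values.
metadata_dnat_rule() {
  iface="$1"; host_ip="$2"; port="$3"
  printf 'iptables -t nat -A PREROUTING -i %s -p tcp -d 169.254.169.254 --dport 80 -j DNAT --to-destination %s:%s\n' \
    "$iface" "$host_ip" "$port"
}

# Example: match traffic arriving from pod interfaces named cali*.
metadata_dnat_rule 'cali+' 10.0.0.10 8181
```

A rule of this shape only sees packets that arrive on a matching interface. Traffic from a pod with hostNetwork set to true is generated inside the host's own network namespace, so it goes through the OUTPUT chain and never matches an interface-based PREROUTING rule.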

You can set up your own iptables rules on the nodes to force all traffic to be intercepted. You'd just want to make sure that nothing running on the host itself (that isn't a pod) needs to talk to AWS; otherwise Kiam would intercept the request, be unable to find a matching Pod, and reject it.

Could you describe a little more about your use of host networking to help figure out a good work-around please?

pingles avatar Jul 13 '18 18:07 pingles

Sure. We're running Suricata in a DaemonSet for intrusion detection. It needs to listen on the host interface rather than namespaced container network interface because it's there to monitor all the other pods. The ruleset we're loading is an artifact stored in S3. The work-around I have is to simply grant read-only access to the bucket to all nodes/masters, because the rules aren't actually sensitive (sourced from publicly obtainable feeds). Just important that you can't modify them just by virtue of being on a node/master. Obviously that's a work-around that's workload specific though.
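A minimal sketch of what that node-level grant could look like, in case it helps someone with a similar workload. The bucket name, role name and policy name are placeholders, not the real ones. The function only prints the policy document; the `aws iam put-role-policy` call is left as a comment.

```shell
#!/bin/sh
# Print a read-only S3 policy for a single bucket.
# The bucket name passed in is a placeholder for illustration.
readonly_bucket_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::${bucket}",
        "arn:aws:s3:::${bucket}/*"
      ]
    }
  ]
}
EOF
}

readonly_bucket_policy example-suricata-rules

# To attach it to the node instance role (hypothetical role and policy names):
#   readonly_bucket_policy example-suricata-rules > policy.json
#   aws iam put-role-policy --role-name example-node-role \
#     --policy-name suricata-rules-readonly \
#     --policy-document file://policy.json
```

Only read actions are listed, so being on a node does not allow modifying the rules.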

sporkmonger avatar Jul 13 '18 20:07 sporkmonger

Interesting. Yeah, I'm not sure there's a good answer to this at the moment. You could implement your own iptables rules to forward metadata API traffic, but I'm not sure of the best way to do that.

Maybe someone else could recommend something better?

pingles avatar Jul 16 '18 15:07 pingles

We are currently using Calico in policy-engine-only mode on our EKS cluster. The Calico pods require hostNetwork: true. Are there any suggested work-arounds to get these two things to work properly together?

I used this guide (after wrapping in a helm chart) to install calico: https://docs.aws.amazon.com/eks/latest/userguide/calico.html

I have KIAM installed via the helm chart.

TBH, I haven't tested what occurs. I just saw this issue and thought I would ask in here...

Is the problem that pods don't work at all with hostNetwork: true, or only that pods which need access to the AWS metadata URL will fail?

geota avatar Oct 26 '18 19:10 geota

I just encountered a similar issue to what @geota has described on my EKS.1 cluster. I had to delete my calico networkpolicy in order for kiam-agents to stop crashing due to livenessProbe failures.

Failure logs of kiam-agent:

WARNING: 2018/10/27 03:16:24 grpc: addrConn.createTransport failed to connect to {10.124.174.104:443 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.124.174.104:443: i/o timeout". Reconnecting...
WARNING: 2018/10/27 03:16:24 grpc: addrConn.createTransport failed to connect to {10.124.174.104:443 1 3132376531613265.kiam-server.mynamespace.svc.cluster.local. <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.124.174.104

Output of kubectl describe:
==========================
  Normal   Started                4m (x3 over 6m)   kubelet, ip-10-124-75-148.ec2.internal  Started container
  Warning  Unhealthy              4m (x7 over 6m)   kubelet, ip-10-124-75-148.ec2.internal  Liveness probe failed: Get http://10.124.75.148:8181/ping: dial tcp 10.124.75.148:8181: getsockopt: connection refused

My calico networkpolicy that I deleted for kiam-agent to stop crashing:

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: mynamespace-deny-from-other-namespaces
  namespace: mynamespace
spec:
  ingress:
  - from:
    - podSelector: {}
  podSelector: {}
  policyTypes:
  - Ingress
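One plausible reading, stated as an assumption: kiam-agent runs on the host network, so its connections to kiam-server come from the node IP and are not matched by `podSelector: {}`. If that is the cause, an extra policy that admits the node address range on the server port might let the deny policy stay in place. This is an untested sketch. The labels `app: kiam` / `component: server` and the CIDR 10.124.0.0/16 are guesses based on the logs above, and the policy name is made up.

```shell
#!/bin/sh
# Print a NetworkPolicy that allows ingress to kiam-server pods on TCP 443
# from a node CIDR. Namespace and CIDR are arguments; the pod labels are
# assumed and must be checked against the real deployment.
kiam_server_allow_nodes_policy() {
  namespace="$1"; node_cidr="$2"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-nodes-to-kiam-server
  namespace: ${namespace}
spec:
  podSelector:
    matchLabels:
      app: kiam
      component: server
  policyTypes:
  - Ingress
  ingress:
  - from:
    - ipBlock:
        cidr: ${node_cidr}
    ports:
    - protocol: TCP
      port: 443
EOF
}

kiam_server_allow_nodes_policy mynamespace 10.124.0.0/16

# To apply (after reviewing the output):
#   kiam_server_allow_nodes_policy mynamespace 10.124.0.0/16 | kubectl apply -f -
```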

faarshad avatar Oct 27 '18 04:10 faarshad

Running

iptables -t nat -A OUTPUT -p tcp -d 169.254.169.254 --dport 80 -j DNAT --to-destination 127.0.0.1:8181

on all the nodes results in:

$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/mytest
error fetching credentials: rpc error: code = Unknown desc = multiple running pods found

I'm guessing the kiam agent can't properly determine which pod is making the request when everything is on the host network?

(sad face)
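A quick way to check that guess, assuming the agent identifies the caller by source IP (inferred from the error message, not confirmed here): list every IP that more than one running pod reports. Host-network pods all report the node's IP, so they show up together. The sample data below is made up; the kubectl command in the comment is how real input could be produced.

```shell
#!/bin/sh
# Read "IP namespace/name" lines on stdin and print each IP that is shared
# by more than one pod, followed by the pods sharing it.
shared_pod_ips() {
  awk 'NF == 2 { count[$1]++; pods[$1] = pods[$1] " " $2 }
       END { for (ip in count) if (count[ip] > 1) print ip ":" pods[ip] }'
}

# Sample data standing in for real cluster output (all names are invented).
printf '%s\n' \
  '10.0.0.10 kube-system/calico-node-abcde' \
  '10.0.0.10 security/suricata-fghij' \
  '100.96.1.5 default/web-klmno' | shared_pod_ips

# Against a real cluster:
#   kubectl get pods --all-namespaces --field-selector=status.phase=Running \
#     -o jsonpath='{range .items[*]}{.status.podIP}{" "}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}' \
#     | shared_pod_ips
```

Any IP printed here is one for which a lookup by source address alone cannot pick a single pod.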

chrisns avatar Jun 28 '19 17:06 chrisns