transport: loopyWriter.run returning. connection error: desc = "transport is closing"
Kiam is not working well. Kubernetes version "v1.14.6", Kiam v3.5, deployed in an external OpenStack cloud (to fetch keys from AWS KMS).
INFO: 2020/05/05 17:53:47 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:53:48 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:53:57 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:53:58 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:07 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:08 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:17 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:18 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:27 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:28 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
WARNING: 2020/05/05 17:54:37 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:59896: read: connection reset by peer
INFO: 2020/05/05 17:54:37 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/05 17:54:38 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
In a pod:
wget -qO- http://169.254.169.254
1.0
2007-01-19
2007-03-01
2007-08-29
2007-10-10
2007-12-15
2008-02-01
2008-09-01
2009-04-04
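Note that a date list like the one above is served both by the real EC2 metadata service and by OpenStack's EC2-compatibility endpoint, so this output alone does not prove the pod is talking to AWS. OpenStack additionally answers on an /openstack path that AWS does not. A sketch of how you might tell them apart — classify_metadata is a hypothetical helper, not part of kiam:

```shell
# classify_metadata: hypothetical helper. Decides which metadata service
# answered based on whether the OpenStack-only /openstack path exists.
classify_metadata() {
    # $1: HTTP status code returned by http://169.254.169.254/openstack
    if [ "$1" = "200" ]; then
        echo "openstack"
    else
        echo "aws-or-unknown"
    fi
}

# From inside a pod (network call, shown for illustration only):
#   code=$(wget -q --server-response -O /dev/null http://169.254.169.254/openstack 2>&1 \
#       | awk '/HTTP\//{print $2; exit}')
#   classify_metadata "$code"
```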
Deployment parameters:
argocd app create kiam \
--repo https://uswitch.github.io/kiam-helm-charts/charts/ \
--helm-chart kiam \
--revision 5.7.0 \
--dest-namespace base \
--dest-server https://10.242.20.10:6443 \
--helm-set-string "server.extraEnv[0].name=AWS_ACCESS_KEY_ID" \
--helm-set-string "server.extraEnv[0].value=Axxxxxxxxxx" \
--helm-set-string "server.extraEnv[1].name=AWS_SECRET_ACCESS_KEY" \
--helm-set-string "server.extraEnv[1].value=Zzxxxxxxxxxxxxxxxxxxxxxxxxxx" \
--helm-set-string "server.extraEnv[2].name=GRPC_GO_LOG_SEVERITY_LEVEL" \
--helm-set-string "server.extraEnv[2].value=info" \
--helm-set-string "server.extraEnv[3].name=GRPC_GO_LOG_VERBOSITY_LEVEL" \
--helm-set-string "server.extraEnv[3].value=8" \
--helm-set-string "extraHostPathMounts[0].name=ssl-certs" \
--helm-set-string "extraHostPathMounts[0].mountPath=/etc/ssl/certs" \
--helm-set-string "extraHostPathMounts[0].readOnly=true" \
--helm-set-string "extraHostPathMounts[0].hostPath=/etc/pki/ca-trust/extracted/pem" \
-p agent.log.level=debug \
-p server.log.level=debug \
-p server.sslCertHostPath=/etc/ssl/certs \
-p agent.tlsSecret=kiam-agent-certificate-secret \
-p agent.tlsCerts.caFileName=ca.crt \
-p agent.tlsCerts.certFileName=tls.crt \
-p agent.tlsCerts.keyFileName=tls.key \
-p server.assumeRoleArn=arn:aws:iam::481746587383:role/kiam-server \
-p server.tlsSecret=kiam-server-certificate-secret \
-p server.tlsCerts.caFileName=ca.crt \
-p server.tlsCerts.certFileName=tls.crt \
-p server.tlsCerts.keyFileName=tls.key \
-p server.roleBaseArn=arn:aws:iam::481746587383:role/
Can anyone help me?
Also, I created the roles with Terraform:
resource "aws_iam_role" "server_role" {
  name        = "kiam-server"
  description = "Role the Kiam Server process assumes"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::481746587383:user/kiam"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_policy" "server_policy" {
  name        = "kiam_server_policy"
  description = "Policy for the Kiam Server process"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

resource "aws_iam_policy_attachment" "server_policy_attach" {
  name       = "kiam-server-attachment"
  roles      = ["${aws_iam_role.server_role.name}"]
  policy_arn = "${aws_iam_policy.server_policy.arn}"
}
Hm... I found that wget -qO- http://169.254.169.254 is getting info from the internal OpenStack API...
UPD: I found the problem:
"error warming credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: x509: certificate signed by unknown authority
and solved it by changing the path:
/ # ls /etc/ssl/certs/
README email-ca-bundle.pem objsign-ca-bundle.pem tls-ca-bundle.pem
and
root@DESKTOP-FFV0RBI:~/rd_argo/argo/proj/base# kubectl exec -it -n base kiam-server-8vzrt /bin/sh
/ # cat /etc/ssl/certs/* | grep zon
# Amazon Root CA 1
# Amazon Root CA 2
# Amazon Root CA 3
# Amazon Root CA 4
# Amazon Root CA 1
# Amazon Root CA 2
# Amazon Root CA 3
# Amazon Root CA 4
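Since the fix was pointing /etc/ssl/certs at the host's extracted PEM bundle, you can also confirm from inside the server pod that the bundle actually validates a given server certificate. A sketch — verify_against_bundle is a hypothetical helper, and the bundle path is the one from this deployment:

```shell
# verify_against_bundle: hypothetical helper, not part of kiam.
# $1: PEM certificate (or chain) to check
# $2: CA bundle, e.g. the tls-ca-bundle.pem listed above
verify_against_bundle() {
    openssl verify -CAfile "$2" "$1"
}

# In the server pod, for a chain saved from sts.amazonaws.com, you could run:
#   verify_against_bundle sts-chain.pem /etc/ssl/certs/tls-ca-bundle.pem
```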
Also, in the server's logs I see:
{"credentials.access.key":"ASIXXXXXXXXXXXXXXX","credentials.expiration":"2020-05-06T17:01:55Z","credentials.role":"kiam-server","level":"info","msg":"requested new credentials","time":"2020-05-06T16:46:55Z"}
{"credentials.access.key":"ASIXXXXXXXXXXXXXXX","credentials.expiration":"2020-05-06T17:01:55Z","credentials.role":"kiam-server","generation.metadata":0,"level":"info","msg":"fetched credentials","pod.iam.role":"kiam-server","pod.name":"aws-iam-tester-7f74788df5-frpkm","pod.namespace":"default","pod.status.ip":"10.242.32.148","pod.status.phase":"Running","resource.version":"18607042","time":"2020-05-06T16:46:55Z"}
so they can get keys. But the problem of
INFO: 2020/05/06 16:51:56 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
and
WARNING: 2020/05/06 16:55:06 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:53962: read: connection reset by peer
persists.
I found that https://github.com/uswitch/kiam/issues/385 says the host still needs access to the EC2 API... Does this mean that Kiam can only work inside AWS?
Kiam does not need to be run in AWS. What is the actual problem you're seeing? You've posted some info and warning logs, but those don't necessarily indicate any actual problems. Are your Kiam servers running and passing their health checks? Are you seeing any error messages in the Kiam server? Are your Kiam agents running and passing their health checks? Are they reporting any errors in their logs?
No, the agents are dead:
{"level":"info","msg":"configuring iptables","time":"2020-05-07T11:01:05Z"}
{"level":"info","msg":"started prometheus metric listener 0.0.0.0:9620","time":"2020-05-07T11:01:05Z"}
{"level":"info","msg":"listening :8181","time":"2020-05-07T11:01:05Z"}
{"level":"info","msg":"stopped","time":"2020-05-07T11:01:14Z"}
{"level":"info","msg":"starting server shutdown","time":"2020-05-07T11:01:14Z"}
{"level":"info","msg":"gracefully shutdown server","time":"2020-05-07T11:01:14Z"}
and in the servers I have these logs:
INFO: 2020/05/07 11:05:31 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:05:36 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
WARNING: 2020/05/07 11:05:41 transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:443->127.0.0.1:48828: read: connection reset by peer
INFO: 2020/05/07 11:05:41 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"","pod.name":"kiam-agent-nfjln","pod.namespace":"base","pod.status.ip":"10.242.20.17","pod.status.phase":"Running","resource.version":"18911803","time":"2020-05-07T11:05:43Z"}
INFO: 2020/05/07 11:05:46 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:05:51 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:05:51 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"","pod.name":"kiam-agent-nfjln","pod.namespace":"base","pod.status.ip":"10.242.20.17","pod.status.phase":"Running","resource.version":"18911851","time":"2020-05-07T11:05:52Z"}
INFO: 2020/05/07 11:05:56 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:06:01 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:06:03 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
{"generation.metadata":0,"level":"debug","msg":"updated pod","pod.iam.role":"","pod.name":"kiam-agent-nfjln","pod.namespace":"base","pod.status.ip":"10.242.20.17","pod.status.phase":"Running","resource.version":"18911902","time":"2020-05-07T11:06:04Z"}
INFO: 2020/05/07 11:06:06 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:06:11 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2020/05/07 11:06:16 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
Something is killing your Kiam agents, as they're shutting down. Are they failing their liveness probe? If so, you should look into why that's happening.
How can I do that? :) The log level is already debug.
kubectl describe pod pod-name
Name: kiam-agent-nfjln
Namespace: base
Node: mom-gatekeeper-argo-0-default-group-0/10.242.20.17
Start Time: Thu, 07 May 2020 00:33:25 +0300
Labels: app=kiam
component=agent
controller-revision-hash=66bb99d55
pod-template-generation=1
release=kiam
Annotations: <none>
Status: Running
IP: 10.242.20.17
IPs: <none>
Controlled By: DaemonSet/kiam-agent
Containers:
kiam-agent:
Container ID: docker://e3dc53b60fc1ad1e579d7e909e0f7fa44467e4f60e037b7795c7bae7b6b615f5
Image: quay.io/uswitch/kiam:v3.5
Image ID: docker-pullable://quay.io/uswitch/kiam@sha256:923020c93162636af89a54f4e96e062341c6ef87b85a6567d1cf0edb7fbff33c
Port: <none>
Host Port: <none>
Command:
/kiam
agent
Args:
--iptables
--no-iptables-remove
--host-interface=cali+
--json-log
--level=debug
--port=8181
--cert=/etc/kiam/tls/tls.crt
--key=/etc/kiam/tls/tls.key
--ca=/etc/kiam/tls/ca.crt
--server-address=kiam-server:443
--prometheus-listen-addr=0.0.0.0:9620
--prometheus-sync-interval=5s
--gateway-timeout-creation=1s
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 11 May 2020 21:45:43 +0300
Finished: Mon, 11 May 2020 21:45:54 +0300
Ready: False
Restart Count: 2574
Liveness: http-get http://:8181/ping delay=3s timeout=1s period=3s #success=1 #failure=3
Environment:
HOST_IP: (v1:status.podIP)
Mounts:
/etc/kiam/tls from tls (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kiam-agent-token-nwjms (ro)
/var/run/xtables.lock from xtables (rw)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tls:
Type: Secret (a volume populated by a Secret)
SecretName: kiam-agent-certificate-secret
Optional: false
xtables:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
kiam-agent-token-nwjms:
Type: Secret (a volume populated by a Secret)
SecretName: kiam-agent-token-nwjms
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/network-unavailable:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 25m (x7693 over 4d21h) kubelet, mom-gatekeeper-argo-0-default-group-0 Liveness probe failed: HTTP probe failed with statuscode: 404
Warning BackOff 49s (x31472 over 4d21h) kubelet, mom-gatekeeper-argo-0-default-group-0 Back-off restarting failed container
I updated the node selector and tolerations so that kiam-server runs on the master nodes and the agents on the other nodes, but I still get these errors.
You're getting "Liveness probe failed: HTTP probe failed with statuscode: 404", which is rather odd. Do you have something else on those nodes that is listening on port 8181?
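One way to check this from the affected node is to replay the kubelet's probe by hand and see who owns the port. A sketch — probe_code is a hypothetical helper, and the host/port are the ones from this thread:

```shell
# probe_code: hypothetical helper that prints only the HTTP status code
# for a URL, useful for replaying the kubelet's liveness probe manually.
probe_code() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# On the affected node (port from this deployment):
#   probe_code http://127.0.0.1:8181/ping   # 200 = kiam agent answering
#   ss -ltnp '( sport = :8181 )'            # which process actually owns the port
```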