Fail to obtain resources via EKS detector in EKS cluster

Open XSAM opened this issue 2 years ago • 4 comments

I got an error when trying to obtain resources from the EKS detector in an EKS cluster.

detecting resources: [isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps "aws-auth" is forbidden: User "system:serviceaccount:foo:default" cannot get resource "configmaps" in API group "" in the namespace "kube-system"]

With this code:

res, err = resource.New(ctx,
	resource.WithDetectors(eks.NewResourceDetector()),
)

It looks like the default service account in the foo namespace does not have permission to read the aws-auth ConfigMap in the kube-system namespace.

Also, I found that the EKS detector needs a namespace called amazon-cloudwatch, which does not exist in my EKS cluster. Am I missing something?

https://github.com/open-telemetry/opentelemetry-go-contrib/blob/82694badfa2dc81622c716e45bac0999568d035e/detectors/aws/eks/detector.go#L39

XSAM avatar Feb 23 '22 09:02 XSAM

Could this be an IAM issue? https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting_iam.html#security-iam-troubleshoot-cannot-view-nodes-or-workloads

bryan-aguilar avatar Feb 24 '22 15:02 bryan-aguilar

Could this be an IAM issue? https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting_iam.html#security-iam-troubleshoot-cannot-view-nodes-or-workloads

I don't think it is an IAM issue. The problem is that the EKS detector requires Kubernetes permissions that a namespace's default service account does not have.

IIUC, the EKS detector is meant to be used in every application, not only in the OTel collector, to detect the current cloud provider and k8s.cluster.name. Thus, the k8s permissions required by the EKS detector should be available out of the box, without further configuration by end users.

However, by default, a Pod uses the namespace's service account called default. This default service account has no permission to read the aws-auth ConfigMap in the kube-system namespace, which makes the EKS detector fail in an EKS cluster.

Also, as an administrator of an EKS cluster, I do not see a namespace called amazon-cloudwatch, which is required by the detector. That means that even if I grant the default service account enough permissions, which is already a bad user experience, the EKS detector could still fail.

XSAM avatar Feb 25 '22 02:02 XSAM

I am finding the same thing. This detector is completely useless.

mmclane avatar Jun 24 '22 14:06 mmclane

Same error here. It is not only an IAM issue.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-EKS-quickstart.html

When I create the ConfigMap amazon-cloudwatch/cluster-info manually, without installing Container Insights, and configure RBAC for the service account, it works.

---
apiVersion: v1
data:
  cluster.name: "CLUSTER_NAME"
  logs.region: "ap-northeast-1"
kind: ConfigMap
metadata:
  name: cluster-info
  namespace: amazon-cloudwatch

I think this code only works on EKS clusters that have CloudWatch Container Insights installed. Why not consider a general solution?

But there is another problem: the regexp in the getContainerID function doesn't match.

func (eksUtils eksDetectorUtils) getContainerID() (string, error) {
	fileData, err := ioutil.ReadFile(defaultCgroupPath)
	if err != nil {
		return "", fmt.Errorf("getContainerID() error: cannot read file with path %s: %w", defaultCgroupPath, err)
	}

	// is this going to stop working with 1.20 when Docker is deprecated?
	r, err := regexp.Compile(`^.*/docker/(.+)$`)
	if err != nil {
		return "", err
	}

	// Retrieve containerID from file
	splitData := strings.Split(strings.TrimSpace(string(fileData)), "\n")
	for _, str := range splitData {
		if r.MatchString(str) {
			return str[len(str)-containerIDLength:], nil
		}
	}
	return "", fmt.Errorf("getContainerID() error: cannot read containerID from file %s", defaultCgroupPath)
}

For a container running on EKS, /proc/self/cgroup does not contain the keyword docker.

kubectl exec nginx-lb-example-5c87bb6c86-7f8kj cat /proc/self/cgroup
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
11:blkio:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
10:pids:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
9:net_cls,net_prio:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
8:freezer:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
7:memory:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
6:hugetlb:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
5:cpuset:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
4:devices:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
3:perf_event:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
2:cpu,cpuacct:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
1:name=systemd:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b
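
As a rough illustration of a match that does not depend on the docker keyword, the sketch below takes the container ID to be a 64-character hexadecimal final path segment. This is an assumption based only on the output above and the old /docker/<id> layout. It is not the detector's actual fix, and it does not try to cover other cgroup layouts (for example the systemd cgroup driver or cgroup v2). The names containerIDPattern and containerIDFromCgroup are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// containerIDPattern matches a cgroup line whose final path segment is a
// 64-character lowercase hexadecimal string and captures that segment.
// Assumption: the container ID is always the last segment of the path.
var containerIDPattern = regexp.MustCompile(`^.*/([0-9a-f]{64})$`)

// containerIDFromCgroup returns the container ID from the first line of
// /proc/self/cgroup-style data that matches containerIDPattern.
func containerIDFromCgroup(data string) (string, error) {
	for _, line := range strings.Split(strings.TrimSpace(data), "\n") {
		if m := containerIDPattern.FindStringSubmatch(strings.TrimSpace(line)); m != nil {
			return m[1], nil
		}
	}
	return "", fmt.Errorf("no container ID found in cgroup data")
}

func main() {
	// Two lines taken from the kubectl exec output above.
	sample := "2:cpu,cpuacct:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b\n" +
		"1:name=systemd:/kubepods/besteffort/podae395171-6a05-4244-8df0-793504ab5a0d/668f628979f883670575fae1c8d224a3bafc6d940020637f3eab66ee4393de3b\n"

	id, err := containerIDFromCgroup(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("container ID:", id)
}
```

Capturing the ID with a group, rather than slicing a fixed number of trailing characters off the line, keeps the length check and the extraction in one place.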

xufanglin avatar Aug 14 '22 02:08 xufanglin