opentelemetry-collector-contrib [processor/k8sattributes] Implement `container.image.id` for k8sattributes processor

Description:

Implements #32152

Adds container.image.id to list of Kubernetes attributes detected by k8sattributes processor. Populates this attribute with the value returned from k8s pod status ImageID field.

Link to tracking Issue: #32152

Testing:

Adds unit tests for extracting the k8s ImageID field from container status.
Adds unit tests for including container.image.id in rules for extraction.
Includes container.image.id in the e2e tests for k8sattributes processor.

Documentation: Added container.image.id as an example of attributes that can be extracted from pod/container

Apr 11 '24 07:04 evantorrie

Test failure seems to be unrelated to these changes.

Apr 11 '24 09:04 evantorrie

This PR was marked stale due to lack of activity. It will be closed in 14 days.

Apr 30 '24 05:04 github-actions[bot]

This PR was marked stale due to lack of activity. It will be closed in 14 days.

May 24 '24 05:05 github-actions[bot]

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Jun 07 '24 05:06 github-actions[bot]

I want to reopen this as this is still something we'd like to be able to obtain from the k8sattributes processor in the OpenTelemetry Collector. What we want to do is to be able to uniquely identify the images (that in our case, will always be sourced from an OCI or docker registry) that make up a telemetry emitting pod - i.e. we can't rely just on the image tags, since tags are mutable.

Based on what @ChrsMark said:

This tells me that Kubernetes either uses the image.id or the image.repoDigests[0] of the runtime. I haven't checked its implementation though to verify this, maybe it worth checking that.

I have looked into the k8s code, CRI spec and a couple of container runtimes and believe the following description is accurate for k8s 1.29, although it would be good to have this confirmed by someone more familiar with the k8s API.

The image and imageID fields in the k8s API response from kubectl get pod -o jsonpath '.status.containerStatuses[*].imageID' is sourced by calling the CRI rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse).

The CRI ContainerStatusResponse proto includes two fields image (of type ImageSpec) and image_ref (of type string) which are ultimately used to populate the k8s API image and imageID fields. a. image = image.Image b. imageID <= image_ref.

Note: In k8s-1.30 and later, there is an additional image_id field (alongside these image and image_ref fields), but by default it's initialized just to the same as "image_ref".

For containerd's CRI implementation, it appears to be somewhat confused (or at least working based on existing practice from dockershim) about how fields should be populated. See this comment:

// TODO(random-liu): Clean up the following logic in CRI. // Current assumption: // * ImageSpec in container config is image ID. // * ImageSpec in container status is image tag. // * ImageRef in container status is repo digest.

If we look further down in that same function, it attempts to parse the various references using ParseAnyReference from github.com/distribution/reference in containerd's image store, partitioning the references into "digests" (must be of type reference.Canonical) and "tags" (must be of type reference.Tagged). From what I can ascertain, reference.Canonical will be used if the reference contains some sort of digest and a "named repository", while reference.Tagged will be used if there is no digest but there is a named repository. It will always populate the CRI status response "image.Image" field with the first "tag" it finds from tags, and the "image_ref" field with the first "digest" it finds from digests.

The issue is that if there is no appropriate reference.Canonical type returned, then ImageRef will retain its initially set value = container.ImageRef which is sourced from containerd's own internal Container.Metadata.ImageRef set when CreateContainer() is called, and may be just an sha256 digest used by containerd's local image store without any corresponding repo name.

So yes, this explains why @ChrsMark saw the following:

$ kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .status.containerStatuses[*]}{.imageID}{", "}{end}{end}' |\
sort

coredns-5d78c9869d-29w9c:	sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc, 
coredns-5d78c9869d-xkcll:	sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc,

For the CRI-O runtime (prior to 1.30), the ContainerStatus image and image_ref fields are populated here. image must be a RegistryImageReference in the CRI-O world, while image_ref is initialized to the someRepoDigest parameter passed to NewContainer(), and which in the function docs says:

// someRepoDigest, if set, is some repo@some digest of the image imageID; it // may have NO RELATIONSHIP to the users’ requested image name (and, which // should be fixed eventually, may be a repo@digest combination which has never // existed on a registry).

As @ChrsMark said:

I believe we should be consistent with the runtime and populate the container.image.repo_digests if the imageID is a repo digest or use the container.image.id if the imageID is not repo digest.

I think one way to proceed is to also use the github.com/distribution/reference package to attempt to parse the output of .imageID from the k8s API response. If it is a canonical reference, then populate the semantic attribute container.image.repo_digests with those values (of which it seems there will only be one from the existing CRI implementations (?)). If it is not, then populate container.image.id.

Jul 04 '24 00:07 evantorrie

Thank's @evantorrie for looking into this! Should we first clarify and fix anything missing (i.e. the inconsistency mentioned at https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32314#discussion_r1563160705) from the Sem Conv side? Then if the implementation is aligned with Sem Conv we should be fine.

Jul 04 '24 12:07 ChrsMark

@ChrsMark After thinking about this more (and also writing an implementation), I think I'm going to close this PR and start a new one, leaving this one here for reference.

Secondly, yes, I do think it would be good to clarify the comment around container.image.id in the context of Kubernetes. As noted above, the current wording for that semantic attribute includes this phrasing:

K8s defines a link to the container registry repository with digest "imageID": "registry.azurecr.io /namespace/service/dockerfile@sha256:bdeabd40c3a8a492eaf9e8e44d0ebbb84bac7ee25ac0cf8a7159d25f62555625". The ID is assigned by the container runtime and can vary in different environments.

As evidenced by the exploration above, the containerd implementation of the k8s CRI interface populates this imageID field with the first repo digest (note: it only ever returns "one" RepoDigest) it can find in its local image registry store.

If it can't find a repo digest entry (e.g. the image was loaded directly into the image store without being pulled from a remote registry), then the imageID field populated in the k8s response will just be the containerd's local Container.Metadata.ImageRef.

My plan is to create a new feature request to populate container.image.repo_digests - ignoring container.image.id completely. For me, I can't see that much value incontainer.image.id (especially as it currently exists in k8s).

Jul 11 '24 01:07 evantorrie

Makes sense! Thank's @evantorrie for looking into this thoroughly!

I filed https://github.com/open-telemetry/semantic-conventions/issues/1236 to clarify the conventions part.

Jul 11 '24 07:07 ChrsMark

opentelemetry-collector-contrib opentelemetry-collector-contrib copied to clipboard

[processor/k8sattributes] Implement `container.image.id` for k8sattributes processor

opentelemetry-collector-contrib
opentelemetry-collector-contrib copied to clipboard