opentelemetry-collector-contrib
opentelemetry-collector-contrib copied to clipboard
[processor/k8sattributes] Implement `container.image.id` for k8sattributes processor
Description:
Implements #32152
Adds container.image.id
to list of Kubernetes attributes detected by k8sattributes
processor.
Populates this attribute with the value returned from k8s pod status ImageID field.
Link to tracking Issue: #32152
Testing:
- Adds unit tests for extracting the k8s
ImageID
field from container status. - Adds unit tests for including
container.image.id
in rules for extraction. - Includes
container.image.id
in the e2e tests for k8sattributes processor.
Documentation: Added container.image.id
as an example of attributes that can be extracted from pod/container
Test failure seems to be unrelated to these changes.
This PR was marked stale due to lack of activity. It will be closed in 14 days.
This PR was marked stale due to lack of activity. It will be closed in 14 days.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
I want to reopen this as this is still something we'd like to be able to obtain from the k8sattributes processor in the OpenTelemetry Collector. What we want to do is to be able to uniquely identify the images (that in our case, will always be sourced from an OCI or docker registry) that make up a telemetry emitting pod - i.e. we can't rely just on the image tags, since tags are mutable.
Based on what @ChrsMark said:
This tells me that Kubernetes either uses the image.id or the image.repoDigests[0] of the runtime. I haven't checked its implementation though to verify this, maybe it worth checking that.
I have looked into the k8s code, CRI spec and a couple of container runtimes and believe the following description is accurate for k8s 1.29, although it would be good to have this confirmed by someone more familiar with the k8s API.
-
The
image
andimageID
fields in the k8s API response fromkubectl get pod -o jsonpath '.status.containerStatuses[*].imageID'
is sourced by calling the CRIrpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse)
.The CRI
ContainerStatusResponse
proto includes two fieldsimage
(of type ImageSpec) andimage_ref
(of type string) which are ultimately used to populate the k8s API image and imageID fields. a.image
=image.Image
b.imageID
<=image_ref
.
Note: In k8s-1.30 and later, there is an additional image_id
field (alongside these image and image_ref fields), but by default it's initialized just to the same as "image_ref".
-
For containerd's CRI implementation, it appears to be somewhat confused (or at least working based on existing practice from dockershim) about how fields should be populated. See this comment:
// TODO(random-liu): Clean up the following logic in CRI. // Current assumption: // * ImageSpec in container config is image ID. // * ImageSpec in container status is image tag. // * ImageRef in container status is repo digest.
If we look further down in that same function, it attempts to parse the various references using
ParseAnyReference
from github.com/distribution/reference in containerd's image store, partitioning the references into "digests" (must be of typereference.Canonical
) and "tags" (must be of typereference.Tagged
). From what I can ascertain,reference.Canonical
will be used if the reference contains some sort of digest and a "named repository", whilereference.Tagged
will be used if there is no digest but there is a named repository. It will always populate the CRI status response "image.Image" field with the first "tag" it finds from tags, and the "image_ref" field with the first "digest" it finds from digests.The issue is that if there is no appropriate
reference.Canonical
type returned, then ImageRef will retain its initially set value =container.ImageRef
which is sourced from containerd's own internalContainer.Metadata.ImageRef
set whenCreateContainer()
is called, and may be just an sha256 digest used by containerd's local image store without any corresponding repo name.So yes, this explains why @ChrsMark saw the following:
$ kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .status.containerStatuses[*]}{.imageID}{", "}{end}{end}' |\
sort
coredns-5d78c9869d-29w9c: sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc,
coredns-5d78c9869d-xkcll: sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc,
-
For the CRI-O runtime (prior to 1.30), the ContainerStatus
image
andimage_ref
fields are populated here.image
must be a RegistryImageReference in the CRI-O world, whileimage_ref
is initialized to thesomeRepoDigest
parameter passed toNewContainer()
, and which in the function docs says:// someRepoDigest, if set, is some repo@some digest of the image imageID; it // may have NO RELATIONSHIP to the users’ requested image name (and, which // should be fixed eventually, may be a repo@digest combination which has never // existed on a registry).
As @ChrsMark said:
I believe we should be consistent with the runtime and populate the container.image.repo_digests if the imageID is a repo digest or use the container.image.id if the imageID is not repo digest.
I think one way to proceed is to also use the github.com/distribution/reference package to attempt to parse the output of .imageID
from the k8s API response. If it is a canonical reference, then populate the semantic attribute container.image.repo_digests
with those values (of which it seems there will only be one from the existing CRI implementations (?)). If it is not, then populate container.image.id
.
Thank's @evantorrie for looking into this! Should we first clarify and fix anything missing (i.e. the inconsistency mentioned at https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32314#discussion_r1563160705) from the Sem Conv side? Then if the implementation is aligned with Sem Conv we should be fine.
@ChrsMark After thinking about this more (and also writing an implementation), I think I'm going to close this PR and start a new one, leaving this one here for reference.
Secondly, yes, I do think it would be good to clarify the comment around container.image.id
in the context of Kubernetes. As noted above, the current wording for that semantic attribute includes this phrasing:
K8s defines a link to the container registry repository with digest
"imageID": "registry.azurecr.io /namespace/service/dockerfile@sha256:bdeabd40c3a8a492eaf9e8e44d0ebbb84bac7ee25ac0cf8a7159d25f62555625"
. The ID is assigned by the container runtime and can vary in different environments.
As evidenced by the exploration above, the containerd implementation of the k8s CRI interface populates this imageID
field with the first repo digest (note: it only ever returns "one" RepoDigest) it can find in its local image registry store.
If it can't find a repo digest entry (e.g. the image was loaded directly into the image store without being pulled from a remote registry), then the imageID field populated in the k8s response will just be the containerd's local Container.Metadata.ImageRef.
My plan is to create a new feature request to populate container.image.repo_digests
- ignoring container.image.id
completely. For me, I can't see that much value incontainer.image.id
(especially as it currently exists in k8s).
Makes sense! Thank's @evantorrie for looking into this thoroughly!
I filed https://github.com/open-telemetry/semantic-conventions/issues/1236 to clarify the conventions part.