opentelemetry-collector-contrib icon indicating copy to clipboard operation
opentelemetry-collector-contrib copied to clipboard

k8sattributes doesn't work with both pod UID and container name in sources

Open paulius-valiunas opened this issue 1 year ago • 15 comments

Component(s)

processor/k8sattributes

What happened?

Description

If I provide pod UID, container name, and namespace name, to the k8sattributes processor, it doesn't seem to work.

Steps to Reproduce

  k8sattributes/filelog:
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
        - from: resource_attribute
          name: k8s.container.name
        - from: resource_attribute
          name: k8s.namespace.name
    extract:
      metadata:
        - k8s.node.name
        - k8s.pod.uid
        - k8s.container.name
        - k8s.namespace.name

Expected Result

Should add k8s.node.name to the resource

Actual Result

Doesn't add anything. Doesn't log anything useful either. With debug logs enabled:

2024-03-29T16:27:32.474Z	debug	[email protected]/processor.go:108	evaluating pod identifier	{"kind": "processor", "name": "k8sattributes/filelog", "pipeline": "logs", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.uid"},"Value":"814b693a-dbc8-44aa-b7ad-22f919e5569c"},{"Source":{"From":"resource_attribute","Name":"k8s.container.name"},"Value":"pod-619863854748671e213bb333e56981de795eeeb0"},{"Source":{"From":"resource_attribute","Name":"k8s.namespace.name"},"Value":"imodeljs"},{"Source":{"From":"","Name":""},"Value":""}]}

After removing k8s.container.name from sources, the processor started adding the node name to my resources.

Collector version

OpenTelemetry Operator 0.42.3

Environment information

Environment

OS: (e.g., "Ubuntu 20.04") Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

paulius-valiunas avatar Apr 02 '24 13:04 paulius-valiunas

Pinging code owners:

  • processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Apr 02 '24 14:04 github-actions[bot]

I think this is because k8s.container.name is not actually added to pod attributes that association can check. If so, this should either be documented, or ideally should result in an error.

swiatekm avatar Apr 02 '24 14:04 swiatekm

Here's an excerpt from the documentation:

Additional container level attributes can be extracted provided that certain resource attributes are provided:

  1. If the container.id resource attribute is provided, the following additional attributes will be available:
    • k8s.container.name
    • container.image.name
    • container.image.tag
  2. If the k8s.container.name resource attribute is provided, the following additional attributes will be available:
    • container.image.name
    • container.image.tag
  3. If the k8s.container.restart_count resource attribute is provided, it can be used to associate with a particular container instance. If it's not set, the latest container instance will be used:
    • container.id (not added by default, has to be specified in metadata)

This led me to believe that k8s.container.name is supported, and that I can add it in addition to the other attributes. Sounds like that's incorrect and the README should be updated...

paulius-valiunas avatar Apr 03 '24 11:04 paulius-valiunas

@paulius-valiunas condition 1. states that k8s.container.name can be extracted as long as the container.id resource attribute is already present. Does your data have the resource attribute container.id?

This feels like a possible duplicate of https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31509, do you agree?

TylerHelmuth avatar Apr 04 '24 20:04 TylerHelmuth

If you're trying to associate the telemetry to pod data using k8s.container.name it must be present in the resource attributes already - you cannot use the processor to add k8s.container.name and use it for association at the same time.

TylerHelmuth avatar Apr 04 '24 20:04 TylerHelmuth

@TylerHelmuth I don't think this works either way. If you look at the implementation of Pod cache updates, it goes like so:

  1. Extract Pod attributes (but not container attributes) and put them in the Attributes field.
  2. Extract container data (but not attributes, all the data) and put it in the Containers field.
  3. Determine Pod identity based on association configuration. This can use arbitrary attributes, but only ones in the Attributes field.
  4. Save Pod data under all the determined identities.

Container attributes are not calculated during this process, but instead at runtime based on the Containers field: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/c38ec5cab617bd7a83b82d66be69c62019da6987/processor/k8sattributesprocessor/processor.go#L147.

swiatekm avatar Apr 05 '24 09:04 swiatekm

@paulius-valiunas condition 1. states that k8s.container.name can be extracted as long as the container.id resource attribute is already present. Does your data have the resource attribute container.id?

This feels like a possible duplicate of #31509, do you agree?

I'm not trying to extract the container name, I already have it. I'm just trying to get the image name and tag.

paulius-valiunas avatar Apr 05 '24 09:04 paulius-valiunas

Also, the documentation says:

Only attribute names from metadata should be used for pod_association's resource_attribute, because empty or non-existing values will be ignored.

That's why I added the "already known" fields to both metadata and pod_association. Seems like this sentence is incorrect or in need of clarification, because I later removed everything except node name from metadata and it still worked.

paulius-valiunas avatar Apr 05 '24 09:04 paulius-valiunas

@swiatekm-sumo you're right. We should add to the README that container fields can't be used for association.

Seems like this sentence is incorrect or in need of clarification, because I later removed everything except node name from metadata and it still worked.

Yes I agree, I think that sentence is wrong and could be removed.

I'm not trying to extract the container name, I already have it. I'm just trying to get the image name and tag.

Similar to https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31509#issuecomment-1999690606, can you check that the container name on the telemetry matches the container name the processor is recording? If they don't match the other container details cannot be extracted.

I also suggest you make the pod_association section look like this:

    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

As @swiatekm-sumo pointed out k8s.container.name won't work as an association and k8s.namespace.name isn't unique to a pod.

TylerHelmuth avatar Apr 05 '24 16:04 TylerHelmuth

We should add to the README that container fields can't be used for association.

In that case, you should remove this section too:

  1. If the k8s.container.name resource attribute is provided, the following additional attributes will be available:
    • container.image.name
    • container.image.tag

If I may ask, how could you possibly get container.image.name using this processor?

Similar to #31509 (comment), can you check that the container name on the telemetry matches the container name the processor is recording? If they don't match the other container details cannot be extracted.

Yes, I'm getting the correct container name, everything comes from Kubernetes log files.

I also suggest you make the pod_association section look like this:

    pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

I'm getting all my logs from a filelog receiver so the connection/IP-related sources won't work.

As @swiatekm-sumo pointed out k8s.container.name won't work as an association and k8s.namespace.name isn't unique to a pod.

You have a point, k8s.namespace.name isn't unique to a pod, it's only there because I was previously using it in combination with k8s.pod.name. Only the combination of both is unique. I later replaced k8s.pod.name with k8s.pod.uid but didn't remove the namespace.

paulius-valiunas avatar Apr 08 '24 09:04 paulius-valiunas

@paulius-valiunas while container fields do nothing in the pod_association section, they have meaning in the extract.metadata section.

This section of the README is accurate:

Additional container level attributes can be extracted provided that certain resource attributes are provided:

1. If the `container.id` resource attribute is provided, the following additional attributes will be available:
   - k8s.container.name
   - container.image.name
   - container.image.tag
2. If the `k8s.container.name` resource attribute is provided, the following additional attributes will be available:
   - container.image.name
   - container.image.tag
3. If the `k8s.container.restart_count` resource attribute is provided, it can be used to associate with a particular container
   instance. If it's not set, the latest container instance will be used:
   - container.id (not added by default, has to be specified in `metadata`)

Although I think we should update the word provided to something clearer. The README is trying to convey that the listed attributes, when configured in extract.metadata, can only be extracted when the named resource attribute is present in the telemetry's resource.

In your situation, you''ll be able to associate the pod logs you're collecting via the filelog receiver to the pod metadata via k8s.pod.uid. If you're using the Collector Helm Chart logsCollection preset, the filelogreceiver configuration will extract this resource attribute for you: https://github.com/open-telemetry/opentelemetry-helm-charts/blob/abe4c7dc5a5f384d5f847a0aa7bfeb12dca9ef90/charts/opentelemetry-collector/templates/_config.tpl#L276-L278 (as well as the k8s.container.name.

Your k8sattributesprocessor config can look like this:

  k8sattributes/filelog:
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
    extract:
      metadata:
        - "container.image.name"
        - "container.image.tag"

TylerHelmuth avatar Apr 10 '24 21:04 TylerHelmuth

Yes, the documentation needs to be a lot clearer. What exactly does "resource attributes are provided" mean? Do they have to be in the resource but not referenced by pod_association?

In addition, I don't think "[they] do nothing in the pod_association section" is accurate - if you include them, the association doesn't work anymore, so they do something, just not what you'd expect.

Overall, some things might be hard to explain in words, but it would be great to have more examples in the documentation.

paulius-valiunas avatar Apr 11 '24 07:04 paulius-valiunas

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Jun 11 '24 03:06 github-actions[bot]

@TylerHelmuth if this issue is still relevant I would be happy to look into this

bacherfl avatar Jun 27 '24 05:06 bacherfl

I have created a PR for the readme in an attempt to make things a bit clearer and highlight potential limitations for association rules: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33800 please let me know if there is something you think needs further elaboration

bacherfl avatar Jun 28 '24 05:06 bacherfl

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Aug 28 '24 03:08 github-actions[bot]

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions[bot] avatar Oct 27 '24 05:10 github-actions[bot]