datadog-agent icon indicating copy to clipboard operation
datadog-agent copied to clipboard

Read `DD_GIT_REPOSITORY_URL` and `DD_GIT_COMMIT_SHA` env vars

Open L3n41c opened this issue 1 year ago • 2 comments

What does this PR do?

Make the agent read the containers DD_GIT_REPOSITORY_URL and DD_GIT_COMMIT_SHA environment variables to populate the git.repository_url and git.commit.sha tags.

Motivation

Those git.repository_url and git.commit.sha tags are currently filled from the values of the org.opencontainers.image.source and org.opencontainers.image.revision docker labels. But those labels are not used by the APM tracer libraries. In order to ease the setup for the customers, it would be simpler to have one way to configure the repository URL and the commit sha that would work for both the agent and the APM tracer libraries.

Additional Notes

Possible Drawbacks / Trade-offs

This works only for the env vars set at the container level. If the env var is set by a script running inside the container, it won’t be seen by the agent.

Describe how to test/QA your changes

Start a container with the repository URL and commit sha set as docker labels DD_GIT_REPOSITORY_URL and DD_GIT_COMMIT_SHA. Check that all the data coming from this container are properly tagged with git.repository_url and git.commit.sha.

L3n41c avatar Jul 03 '24 13:07 L3n41c

Test changes on VM

Use this command from test-infra-definitions to manually test this PR changes on a VM:

inv create-vm --pipeline-id=38365180 --os-family=ubuntu

Note: This applies to commit fa87bc20

pr-commenter[bot] avatar Jul 03 '24 14:07 pr-commenter[bot]

Regression Detector

Regression Detector Results

Run ID: 1598518f-751c-432e-9650-7cc4a9e1f2a2 Metrics dashboard Target profiles

Baseline: 51437cf664fcf8cc90b006b2e21e08de49a4f4e0 Comparison: fa87bc20591d35e96e469bcd7adea93cd8dd4861

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00% Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Fine details of change detection per experiment

perf experiment goal Δ mean % Δ mean % CI links
pycheck_1000_100byte_tags % cpu utilization +1.40 [-3.53, +6.32] Logs
uds_dogstatsd_to_api_cpu % cpu utilization +0.59 [-0.29, +1.47] Logs
uds_dogstatsd_to_api ingress throughput +0.00 [-0.00, +0.00] Logs
tcp_dd_logs_filter_exclude ingress throughput -0.00 [-0.01, +0.01] Logs
idle memory utilization -0.08 [-0.12, -0.04] Logs
file_tree memory utilization -0.14 [-0.26, -0.02] Logs
basic_py_check % cpu utilization -0.24 [-2.97, +2.49] Logs
otel_to_otel_logs ingress throughput -0.26 [-1.07, +0.55] Logs
tcp_syslog_to_blackhole ingress throughput -3.20 [-15.75, +9.35] Logs

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".

pr-commenter[bot] avatar Jul 03 '24 14:07 pr-commenter[bot]

I'm not sure about the feature.

@clamoriniere Hi, so the goal is to be able to tag events (like logs) with the repo. This would allow to search logs by repository which would be useful for IDE plug-ins where association services with code is not easily doable.

Also this would unify (and simplify) the configuration: Source Code Integration already needs this environment variable and agent was using OCI labels, now consumers will only have to set these environment variables.

bric3 avatar Jul 04 '24 08:07 bric3

@bric3 Although I can understand the value brought by having all the telemetry data tagged by git.repository_url and git.commit.sha, I share the same concerns as @clamoriniere regarding the use of environment variables for that purpose.

Today, the agent is already automatically tagging with git.repository_url and git.commit.sha all the data coming from containers the image of which are labeled with org.opencontainers.image.source and org.opencontainers.image.revision: https://github.com/DataDog/datadog-agent/blob/6e355775b3d0cb61b371c3600a55b0ed755e35ee/comp/core/tagger/taggerimpl/collectors/workloadmeta_extract.go#L104-L108

That’s the mechanism that has been leveraged in #26156 to ensure that data coming from the datadog agent itself are properly tagged with the source code integration required tags.

Those docker labels are not datadog specific. They are Open Container standard. Here is the spec. The advantage of leveraging those standard labels is that they are potentially known and already used by many other tools. For ex. people building their docker images with GitHub actions will get those labels without even noticing it thanks to the docker/metadata-action GitHub action.

On the other hand, using DD_GIT_REPOSITORY_URL and DD_GIT_COMMIT_SHA environment variables isn’t standard and so, will necessarily require specific extra setup for the users.

It can easily be broken because there’s no guarantee that the daemon process running inside a container is having exactly the same environment variables as the ones defined in the ENV statements of the Dockerfile. There can be an entrypoint (something like S6) that doesn’t pass all the environment variables or that sets some extra ones.

Anyway, the “repository URL” and the “commit SHA” must be immutable properties of the docker image itself and not a parameter that be decided at run time with a spec.containers.*.env setting in a Kubernetes pod manifest or a docker run --env … parameter.

I feel that the main argument in favor of using environment variables seems to be consistent with what we have with the APM tracer libraries because APM tracer libraries are unable to read docker or Kubernetes labels.

I suspected that APM tracer libraries were reading the environment variables to set the proper tags. But they shouldn’t. Tags should be added by the trace-agent thanks to origin detection. So, if tags are properly set by the trace-agent, APM libraries don’t need to set tags. If they don’t need to set tags, I’m not sure why they would require the DD_GIT_REPOSITORY_URL and DD_GIT_COMMIT_SHA environment variables.

L3n41c avatar Jul 04 '24 14:07 L3n41c

/merge

L3n41c avatar Jul 05 '24 14:07 L3n41c

:steam_locomotive: MergeQueue: pull request added to the queue

The median merge time in main is 24m.

Use /merge -c to cancel this operation!

dd-devflow[bot] avatar Jul 05 '24 14:07 dd-devflow[bot]