opentelemetry-java
opentelemetry-java copied to clipboard
Improve detection of resource attributes on ECS
This improves the detection of resource attributes on ECS by fetching ECS metadata from ECS_CONTAINER_METADATA_URI or ECS_CONTAINER_METADATA_URI_V4.
Previously only CONTAINER_NAME and CONTAINER_ID id were set.
Now we set:
- CONTAINER_ID
- CONTAINER_NAME
- AWS_ECS_CONTAINER_ARN
- CONTAINER_IMAGE_NAME
- CONTAINER_IMAGE_TAG
- aws.ecs.container.image.id
- AWS_LOG_GROUP_ARNS
- AWS_LOG_GROUP_NAMES
- AWS_LOG_STREAM_NAMES
- AWS_ECS_TASK_ARN
- AWS_ECS_TASK_FAMILY
- AWS_ECS_TASK_REVISION
Especially AWS_LOG_GROUP_ARNS is important so that connection of traces to logs works OOTB on X-Ray.
Codecov Report
Base: 90.04% // Head: 90.60% // Increases project coverage by +0.56% :tada:
Coverage data is based on head (
1fc24b4) compared to base (0860b38). Patch has no changes to coverable lines.
Additional details and impacted files
@@ Coverage Diff @@
## main #4574 +/- ##
============================================
+ Coverage 90.04% 90.60% +0.56%
+ Complexity 5057 4906 -151
============================================
Files 582 567 -15
Lines 15580 14728 -852
Branches 1495 1414 -81
============================================
- Hits 14029 13345 -684
+ Misses 1096 959 -137
+ Partials 455 424 -31
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...java/io/opentelemetry/sdk/trace/data/SpanData.java | 75.00% <0.00%> (-25.00%) |
:arrow_down: |
| ...esting/context/SettableContextStorageProvider.java | 37.93% <0.00%> (-24.14%) |
:arrow_down: |
| ...a/io/opentelemetry/sdk/trace/internal/JcTools.java | 28.57% <0.00%> (-4.77%) |
:arrow_down: |
| ...ernal/aggregator/ExplicitBucketHistogramUtils.java | 97.29% <0.00%> (-2.71%) |
:arrow_down: |
| .../opentelemetry/sdk/internal/ComponentRegistry.java | 90.90% <0.00%> (-2.43%) |
:arrow_down: |
| ...o/opentelemetry/api/internal/ReadOnlyArrayMap.java | 91.11% <0.00%> (-2.23%) |
:arrow_down: |
| ...opentelemetry/opentracingshim/SpanBuilderShim.java | 77.89% <0.00%> (-0.68%) |
:arrow_down: |
| ...n/java/io/opentelemetry/sdk/logs/LogProcessor.java | 85.71% <0.00%> (ø) |
|
| .../java/io/opentelemetry/sdk/logs/SdkLogEmitter.java | 100.00% <0.00%> (ø) |
|
| ...va/io/opentelemetry/sdk/logs/NoopLogProcessor.java | 100.00% <0.00%> (ø) |
|
| ... and 160 more |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
Thanks so much for this improvement! I will have someone from AWS take a look at this shortly
This PR was marked stale due to lack of activity. It will be closed in 14 days.
Bump: Please don't close this PR automatically.
Thanks so much for this improvement! I will have someone from AWS take a look at this shortly
@willarmiros any update here?
Apologies for the delay here, we will be having someone review this shortly!
@felixscheinost i missed out on this PR and worked on #4670. Your PR is effectively a superset of mine, with the exception of AWS_ECS_LAUNCHTYPE and AWS_LOGS_STREAM_ARNS and a couple small bugs in the generation of the AWS Log Group ARN. I prepared a fix for your PR in https://github.com/lumigo-io/opentelemetry-java/tree/pr-for-felix. @willarmiros is going to have a look at these PRs in the next few days. Would you be interested in pulling into your PR my changes? :-)
@mmanciop Thanks for rebasing your changes on my PR!
Regarding the change in the log group ARN. That dosn't seem right to me. Have you tested this on AWS?
I just changed this line on my branch, to test this change in isolation, and deployed to our test evironment.
// from
attributesBuilder.put(ResourceAttributes.AWS_LOG_GROUP_ARNS, "arn:aws:logs:" + region + ":" + account + ":log-group:" + logGroupName);
// to
attributesBuilder.put(ResourceAttributes.AWS_LOG_GROUP_ARNS, "arn:aws:logs:" + region + ":" + account + ":log-group:/aws" + logGroupName + ":*");
With this change in place I can no longer see the logs for a trace in CloudWatch.
If I view a trace, for which I previously could view logs, and click on "Resources" for "Cloudwatch logs" I see the following:
[
{
"log_group": "*",
"arn": "arn:aws:logs:eu-central-1:<account>:log-group:/aws<the name of my log group>:*"
}
]
I also tried adding another separting / between /aws and logGroupName but that also doesn't work. In CloudWatch I then see:
[
{
"log_group": "*",
"arn": "arn:aws:logs:eu-central-1:<account>:log-group:/aws/<the name of my log group>:*"
}
]
But that is also wrong as my log group is just called <the name of my log group> and not /aws/<the name of my log group>.
@mmanciop Thanks for rebasing your changes on my PR!
Regarding the change in the log group ARN. That dosn't seem right to me. Have you tested this on AWS?
I just changed this line on my branch, to test this change in isolation, and deployed to our test evironment.
// from attributesBuilder.put(ResourceAttributes.AWS_LOG_GROUP_ARNS, "arn:aws:logs:" + region + ":" + account + ":log-group:" + logGroupName); // to attributesBuilder.put(ResourceAttributes.AWS_LOG_GROUP_ARNS, "arn:aws:logs:" + region + ":" + account + ":log-group:/aws" + logGroupName + ":*");With this change in place I can no longer see the logs for a trace in CloudWatch.
If I view a trace, for which I previously could view logs, and click on "Resources" for "Cloudwatch logs" I see the following:
[ { "log_group": "*", "arn": "arn:aws:logs:eu-central-1:<account>:log-group:/aws<the name of my log group>:*" } ]I also tried adding another separting
/between/awsandlogGroupNamebut that also doesn't work. In CloudWatch I then see:[ { "log_group": "*", "arn": "arn:aws:logs:eu-central-1:<account>:log-group:/aws/<the name of my log group>:*" } ]But that is also wrong as my log group is just called
<the name of my log group>and not/aws/<the name of my log group>.
I compared the ARNs with the output of aws logs describe-log-groups and aws logs describe-log-streams. At first, when I started investigating the change, I was as surprised as you. @willarmiros do you concur with these ARN values?
@mmanciop When I do describe-log-groups on my log groups I get an ARN without /aws.
Only when the log group name contains /aws, like for lambdas, does the ARN also contain /aws.
@mmanciop When I do
describe-log-groupson my log groups I get an ARN without/aws.Only when the log group name contains
/aws, like for lambdas, does the ARN also contain/aws.
I tried it on ECS, but this bears more validation. Brb :-)
So, what I see in the ECS console when deploying a simple CDK + Amazon ECS project is this:
awslogs-group | /aws/batch/job
-- | --
awslogs-region | eu-central-1
awslogs-stream-prefix | CrawlerJobDefinition4D5-55fb222b188f18e
The log-group is called /aws/batch/job in CloudWatch and the ARN is arn:aws:logs:eu-central-1:170906415945:log-group:/aws/batch/job:*.
The AWS CLI reports:
% aws logs describe-log-groups | jq '.logGroups[] | select(.logGroupName == "/aws/batch/job")'
{
"logGroupName": "/aws/batch/job",
"creationTime": 1658054662260,
"metricFilterCount": 0,
"arn": "arn:aws:logs:eu-central-1:170906415945:log-group:/aws/batch/job:*",
"storedBytes": 141110571
}
So, yes, we should not add programmatically /aws. The :* at the end of the ARN for the log-group, however, seem pretty legit. Do you agree @felixscheinost ?
@felixscheinost I pushed a commit to remove the spurious /aws prefix from the ARNs. Thanks for catching that :-)
The :* at the end of the ARN for the log-group, however, seem pretty legit.
Yeah, I think so. It seems like it does work without it but both the CLI and aws.amazon.com display log groups with that trailing :*.
Okay, I will manually test the other changes from your commit after the weekend and if everything looks fine (which I assume it will), pull them into this PR.
Is it okay for you if I squash your two commits into one and push it on top of this branch?
Is it okay for you if I squash your two commits into one and push it on top of this branch?
Yup, I think in the end it is best to squash the entire PR on merge anyhow
Hi @mmanciop, unfortunately even the trailing :* seems to break showing the logs for a trace.
I cherry picked all your changes, deployed and tried showing the logs for a trace but didn't see any. As sometimes the logs take a while to show up I waited and tried again but still no logs.
Then I removed the trailing :* from the log group ARNs, deployed again and now I can see the logs again.
The output for the trace under "Resources" -> "CloudWatch logs" is:
[
{
"log_group": "*",
"arn": "arn:aws:logs:eu-central-1:<account number>:log-group:<log group name>:*"
}
]
vs this:
[
{
"log_group": "<log group name>",
"arn": "arn:aws:logs:eu-central-1:<account number>:log-group:<log group name>"
}
]
So it seems CloudWatch is very peculiar about how the log group ARNs are formatted. Maybe someone from AWS can verify this.
I cherry picked all your changes, deployed and tried showing the logs for a trace but didn't see any. As sometimes the logs take a while to show up I waited and tried again but still no logs.
Then I removed the trailing :* from the log group ARNs, deployed again and now I can see the logs again.
@felixscheinost how exactly are you using these ARNs? With which tool? Because the issue could be there too.
Besides, the spec clearly states with the example that the :* must be there: https://opentelemetry.io/docs/reference/specification/resource/semantic_conventions/cloud_provider/aws/logs/
I am using CloudWatch traces on the official AWS web console.
Besides, the spec clearly states with the example that the :* must be there: https://opentelemetry.io/docs/reference/specification/resource/semantic_conventions/cloud_provider/aws/logs/
These are merely "examples".
This links also references the official AWS documentation, which states:
Both of the following are used. The second one, with the * at the end, is what is returned by the describe-log-groups CLI command and the DescribeLogGroups API. arn:aws:logs:region:account-id:log-group:log_group_name arn:aws:logs:region:account-id:log-group:log_group_name:*
I think in the end what works is what's correct. 3rd party tools which parse aws.log.group.arns can be adapted. AWS CloudWatch can hardly be adapted.
I suppose @willarmiros, @Aneurysm9 should be making the call and, if :* has to go, update the examples in the spec. However, considering that in CloudWatch actually is showing even in the console ARNs with the :*, I think it is a bug that X-Ray should fix.

Sorry for being late here: so from my reading there were 2 issues:
/awswas being pre-pended to log groups in the resource detector, but that was fixed to no longer do that (which is the correct fix)- There is a question of whether to include the trailing
:*in the log group ARN when recording it on the trace. It seems like the expected behavior is "it doesn't matter" because both such ARN formats are valid, however in practice only the ARN without the trailing:*works. I have just found the bug on our end here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/8a821bb7829a23924edd2e23b22c6751277ab2ea/exporter/awsxrayexporter/internal/translator/aws.go#L272-L279
I will submit a fix for the bug so that both ARN formats work. The spec should stay as-is because the examples are still valid formats, just our parsing code is wrong. However, for now it would be best if the resource detectors could record the ARNs without the trailing :* so that everything still works while we wait for the fix to land.
Hi @mmanciop - I have opened the PR to address the :* issue on the X-Ray exporter side, and @jj22ee has left a few comments but approved. Would you mind addressing the comments so we can then merge?
@willarmiros I cannot push to @felixscheinost's branch (I have to push rights for his fork, which he used to open this PR), but I prepared the commit on https://github.com/lumigo-io/opentelemetry-java/tree/pr-for-felix .
I addressed @jj22ee's comments and cherry-picked @mmanciop's additions.
@willarmiros
However, for now it would be best if the resource detectors could record the ARNs without the trailing :* so that everything still works while we wait for the fix to land.
I pushed another commit which removes the trailing :* for now.
@willarmiros when you think this is ready, please approve and we can move forward. Thanks!