datadog-agent
datadog-agent copied to clipboard
Read 128 bit trace ID in End Invocation
What does this PR do?
@duncanista pointed out we also need to read the upper 64 bits of a trace ID in the End Invocation flow. This is a quick hack to add that. The right way, of course, would be using tracer.Inject and tracer.Extract.
Motivation
Describe how to test/QA your changes
Possible Drawbacks / Trade-offs
Additional Notes
Test changes on VM
Use this command from test-infra-definitions to manually test this PR changes on a VM:
inv create-vm --pipeline-id=48492969 --os-family=ubuntu
Note: This applies to commit 78cb994a
Regression Detector
/merge
:steam_locomotive: MergeQueue: waiting for PR to be ready
This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.
Use /merge -c to cancel this operation!
/remove
:steam_locomotive: Devflow: /remove
:warning: MergeQueue: This merge request was unqueued
This merge request was unqueued
If you need support, contact us on Slack #devflow!
Serverless Benchmark Results
BenchmarkStartEndInvocation comparison between 0c2f3b04b2b6a72190dcf53a36adf6c0d711aea7 and bacb5604700b2af18a68d0b265c62084b79a8c52.
tl;dr
Use these benchmarks as an insight tool during development.
-
Skim down the
vs basecolumn in each chart. If there is a~, then there was no statistically significant change to the benchmark. Otherwise, ensure the estimated percent change is either negative or very small. -
The last row of each chart is the
geomean. Ensure this percentage is either negative or very small.
What is this benchmarking?
The BenchmarkStartEndInvocation compares the amount of time it takes to call the start-invocation and end-invocation endpoints. For universal instrumentation languages (Dotnet, Golang, Java, Ruby), this represents the majority of the duration overhead added by our tracing layer.
The benchmark is run using a large variety of lambda request payloads. In the charts below, there is one row for each event payload type.
How do I interpret these charts?
The charts below comes from benchstat. They represent the statistical change in duration (sec/op), memory overhead (B/op), and allocations (allocs/op).
The benchstat docs explain how to interpret these charts.
Before the comparison table, we see common file-level configuration. If there are benchmarks with different configuration (for example, from different packages), benchstat will print separate tables for each configuration.
The table then compares the two input files for each benchmark. It shows the median and 95% confidence interval summaries for each benchmark before and after the change, and an A/B comparison under "vs base". ... The p-value measures how likely it is that any differences were due to random chance (i.e., noise). The "~" means benchstat did not detect a statistically significant difference between the two inputs. ...
Note that "statistically significant" is not the same as "large": with enough low-noise data, even very small changes can be distinguished from noise and considered statistically significant. It is, of course, generally easier to distinguish large changes from noise.
Finally, the last row of the table shows the geometric mean of each column, giving an overall picture of how the benchmarks changed. Proportional changes in the geomean reflect proportional changes in the benchmarks. For example, given n benchmarks, if sec/op for one of them increases by a factor of 2, then the sec/op geomean will increase by a factor of ⁿ√2.
I need more help
First off, do not worry if the benchmarks are failing. They are not tests. The intention is for them to be a tool for you to use during development.
If you would like a hand interpreting the results come chat with us in #serverless-agent in the internal DataDog slack or in #serverless in the public DataDog slack. We're happy to help!
Benchmark stats
goos: linux
goarch: amd64
pkg: github.com/DataDog/datadog-agent/pkg/serverless/daemon
cpu: AMD EPYC 7763 64-Core Processor
│ baseline/benchmark.log │ current/benchmark.log │
│ sec/op │ sec/op vs base │
api-gateway-appsec.json 86.89µ ± 3% 85.73µ ± 2% ~ (p=0.247 n=10)
api-gateway-kong-appsec.json 68.02µ ± 9% 66.61µ ± 2% -2.06% (p=0.029 n=10)
api-gateway-kong.json 65.80µ ± 4% 66.27µ ± 3% ~ (p=0.529 n=10)
api-gateway-non-proxy-async.json 103.6µ ± 2% 105.2µ ± 1% +1.52% (p=0.011 n=10)
api-gateway-non-proxy.json 105.3µ ± 1% 105.3µ ± 1% ~ (p=0.971 n=10)
api-gateway-websocket-connect.json 69.04µ ± 1% 70.32µ ± 1% +1.85% (p=0.002 n=10)
api-gateway-websocket-default.json 62.12µ ± 1% 63.66µ ± 1% +2.48% (p=0.000 n=10)
api-gateway-websocket-disconnect.json 62.11µ ± 1% 64.10µ ± 2% +3.20% (p=0.000 n=10)
api-gateway.json 112.5µ ± 3% 115.2µ ± 1% ~ (p=0.052 n=10)
application-load-balancer.json 62.81µ ± 1% 65.09µ ± 1% +3.62% (p=0.000 n=10)
cloudfront.json 47.40µ ± 3% 48.84µ ± 2% +3.05% (p=0.001 n=10)
cloudwatch-events.json 37.82µ ± 3% 39.87µ ± 1% +5.42% (p=0.000 n=10)
cloudwatch-logs.json 65.06µ ± 2% 67.35µ ± 1% +3.52% (p=0.000 n=10)
custom.json 30.42µ ± 2% 32.20µ ± 2% +5.84% (p=0.000 n=10)
dynamodb.json 91.71µ ± 2% 93.29µ ± 1% +1.72% (p=0.007 n=10)
empty.json 28.81µ ± 3% 30.33µ ± 2% +5.26% (p=0.000 n=10)
eventbridge-custom.json 46.88µ ± 2% 47.82µ ± 5% +2.02% (p=0.002 n=10)
eventbridge-no-bus.json 45.94µ ± 2% 47.49µ ± 2% +3.38% (p=0.001 n=10)
eventbridge-no-timestamp.json 46.12µ ± 1% 47.27µ ± 1% +2.50% (p=0.000 n=10)
eventbridgesns.json 62.69µ ± 2% 65.20µ ± 2% +4.01% (p=0.000 n=10)
eventbridgesqs.json 71.96µ ± 3% 72.92µ ± 1% ~ (p=0.165 n=10)
http-api.json 73.93µ ± 3% 73.74µ ± 2% ~ (p=0.190 n=10)
kinesis-batch.json 72.32µ ± 3% 71.48µ ± 2% ~ (p=0.315 n=10)
kinesis.json 55.63µ ± 2% 54.83µ ± 2% -1.43% (p=0.043 n=10)
s3.json 61.13µ ± 2% 60.91µ ± 1% ~ (p=0.739 n=10)
sns-batch.json 91.54µ ± 2% 92.69µ ± 1% ~ (p=0.105 n=10)
sns.json 68.12µ ± 2% 68.92µ ± 2% ~ (p=0.165 n=10)
snssqs.json 119.2µ ± 2% 119.9µ ± 2% ~ (p=0.247 n=10)
snssqs_no_dd_context.json 107.5µ ± 2% 107.9µ ± 2% ~ (p=0.631 n=10)
sqs-aws-header.json 60.68µ ± 2% 59.70µ ± 2% ~ (p=0.063 n=10)
sqs-batch.json 98.98µ ± 2% 96.31µ ± 2% -2.69% (p=0.003 n=10)
sqs.json 73.02µ ± 4% 72.77µ ± 2% ~ (p=0.579 n=10)
sqs_no_dd_context.json 69.56µ ± 2% 68.85µ ± 3% ~ (p=0.315 n=10)
stepfunction.json 47.78µ ± 3% 49.19µ ± 4% ~ (p=0.143 n=10)
geomean 65.94µ 66.86µ +1.39%
│ baseline/benchmark.log │ current/benchmark.log │
│ B/op │ B/op vs base │
api-gateway-appsec.json 37.34Ki ± 0% 37.35Ki ± 0% +0.04% (p=0.022 n=10)
api-gateway-kong-appsec.json 26.93Ki ± 0% 26.95Ki ± 0% +0.06% (p=0.006 n=10)
api-gateway-kong.json 24.43Ki ± 0% 24.45Ki ± 0% +0.08% (p=0.000 n=10)
api-gateway-non-proxy-async.json 48.13Ki ± 0% 48.13Ki ± 0% ~ (p=1.000 n=10)
api-gateway-non-proxy.json 47.35Ki ± 0% 47.35Ki ± 0% ~ (p=0.542 n=10)
api-gateway-websocket-connect.json 25.53Ki ± 0% 25.54Ki ± 0% ~ (p=0.565 n=10)
api-gateway-websocket-default.json 21.44Ki ± 0% 21.44Ki ± 0% ~ (p=0.158 n=10)
api-gateway-websocket-disconnect.json 21.22Ki ± 0% 21.22Ki ± 0% ~ (p=0.127 n=10)
api-gateway.json 49.59Ki ± 0% 49.60Ki ± 0% ~ (p=0.780 n=10)
application-load-balancer.json 23.31Ki ± 0% 23.32Ki ± 0% ~ (p=0.323 n=10)
cloudfront.json 17.67Ki ± 0% 17.69Ki ± 0% +0.11% (p=0.006 n=10)
cloudwatch-events.json 11.73Ki ± 0% 11.76Ki ± 0% +0.23% (p=0.000 n=10)
cloudwatch-logs.json 53.38Ki ± 0% 53.39Ki ± 0% +0.02% (p=0.022 n=10)
custom.json 9.746Ki ± 0% 9.775Ki ± 0% +0.30% (p=0.000 n=10)
dynamodb.json 40.81Ki ± 0% 40.82Ki ± 0% ~ (p=0.075 n=10)
empty.json 9.294Ki ± 0% 9.339Ki ± 0% +0.49% (p=0.000 n=10)
eventbridge-custom.json 15.01Ki ± 0% 15.02Ki ± 0% ~ (p=0.838 n=10)
eventbridge-no-bus.json 13.99Ki ± 0% 14.00Ki ± 0% ~ (p=0.197 n=10)
eventbridge-no-timestamp.json 14.04Ki ± 0% 14.03Ki ± 0% ~ (p=1.000 n=10)
eventbridgesns.json 20.92Ki ± 0% 21.00Ki ± 0% +0.40% (p=0.023 n=10)
eventbridgesqs.json 25.17Ki ± 0% 25.17Ki ± 0% ~ (p=0.516 n=10)
http-api.json 23.93Ki ± 0% 23.94Ki ± 0% ~ (p=0.956 n=10)
kinesis-batch.json 27.16Ki ± 0% 27.15Ki ± 0% ~ (p=0.516 n=10)
kinesis.json 17.91Ki ± 0% 17.92Ki ± 0% ~ (p=0.469 n=10)
s3.json 20.49Ki ± 0% 20.50Ki ± 1% ~ (p=0.796 n=10)
sns-batch.json 39.87Ki ± 0% 39.99Ki ± 0% ~ (p=0.143 n=10)
sns.json 25.15Ki ± 1% 25.19Ki ± 0% ~ (p=0.218 n=10)
snssqs.json 53.87Ki ± 0% 53.94Ki ± 0% ~ (p=0.353 n=10)
snssqs_no_dd_context.json 47.68Ki ± 0% 47.64Ki ± 0% ~ (p=0.160 n=10)
sqs-aws-header.json 19.44Ki ± 1% 19.41Ki ± 1% ~ (p=0.955 n=10)
sqs-batch.json 42.35Ki ± 1% 42.25Ki ± 0% ~ (p=0.225 n=10)
sqs.json 26.30Ki ± 1% 26.25Ki ± 1% ~ (p=0.280 n=10)
sqs_no_dd_context.json 21.84Ki ± 1% 21.88Ki ± 1% ~ (p=0.853 n=10)
stepfunction.json 14.29Ki ± 1% 14.29Ki ± 1% ~ (p=0.543 n=10)
geomean 24.61Ki 24.62Ki +0.07%
│ baseline/benchmark.log │ current/benchmark.log │
│ allocs/op │ allocs/op vs base │
api-gateway-appsec.json 629.0 ± 0% 630.5 ± 0% +0.24% (p=0.001 n=10)
api-gateway-kong-appsec.json 488.0 ± 0% 489.0 ± 0% +0.20% (p=0.000 n=10)
api-gateway-kong.json 466.0 ± 0% 467.0 ± 0% +0.21% (p=0.000 n=10)
api-gateway-non-proxy-async.json 725.0 ± 0% 725.0 ± 0% ~ (p=1.000 n=10)
api-gateway-non-proxy.json 716.0 ± 0% 716.0 ± 0% ~ (p=1.000 n=10)
api-gateway-websocket-connect.json 453.0 ± 0% 453.0 ± 0% ~ (p=1.000 n=10) ¹
api-gateway-websocket-default.json 379.0 ± 0% 379.0 ± 0% ~ (p=1.000 n=10)
api-gateway-websocket-disconnect.json 370.0 ± 0% 370.0 ± 0% ~ (p=0.582 n=10)
api-gateway.json 791.0 ± 0% 791.0 ± 0% ~ (p=1.000 n=10)
application-load-balancer.json 353.0 ± 0% 353.0 ± 0% ~ (p=1.000 n=10) ¹
cloudfront.json 284.0 ± 0% 285.0 ± 0% +0.35% (p=0.000 n=10)
cloudwatch-events.json 220.0 ± 0% 221.0 ± 0% +0.45% (p=0.000 n=10)
cloudwatch-logs.json 215.0 ± 0% 216.0 ± 0% +0.47% (p=0.002 n=10)
custom.json 168.0 ± 0% 169.0 ± 1% +0.60% (p=0.000 n=10)
dynamodb.json 589.0 ± 0% 590.0 ± 0% +0.17% (p=0.001 n=10)
empty.json 159.0 ± 1% 161.0 ± 0% +1.26% (p=0.000 n=10)
eventbridge-custom.json 266.0 ± 0% 266.0 ± 0% ~ (p=0.303 n=10)
eventbridge-no-bus.json 257.0 ± 0% 257.5 ± 0% ~ (p=0.450 n=10)
eventbridge-no-timestamp.json 258.0 ± 0% 258.0 ± 0% ~ (p=1.000 n=10)
eventbridgesns.json 325.0 ± 0% 326.0 ± 0% ~ (p=0.082 n=10)
eventbridgesqs.json 367.0 ± 0% 367.0 ± 0% ~ (p=0.966 n=10)
http-api.json 434.0 ± 0% 434.0 ± 0% ~ (p=0.746 n=10)
kinesis-batch.json 392.0 ± 0% 393.0 ± 0% ~ (p=0.082 n=10)
kinesis.json 286.0 ± 0% 287.0 ± 0% +0.35% (p=0.003 n=10)
s3.json 359.0 ± 1% 360.0 ± 1% ~ (p=0.337 n=10)
sns-batch.json 477.5 ± 1% 479.5 ± 0% ~ (p=0.123 n=10)
sns.json 346.5 ± 1% 347.0 ± 1% ~ (p=0.287 n=10)
snssqs.json 478.0 ± 1% 479.0 ± 0% ~ (p=0.288 n=10)
snssqs_no_dd_context.json 438.0 ± 1% 438.0 ± 0% ~ (p=0.673 n=10)
sqs-aws-header.json 286.0 ± 1% 285.5 ± 1% ~ (p=0.922 n=10)
sqs-batch.json 517.0 ± 1% 516.0 ± 1% ~ (p=0.320 n=10)
sqs.json 365.5 ± 1% 364.0 ± 1% ~ (p=0.314 n=10)
sqs_no_dd_context.json 348.5 ± 1% 350.0 ± 1% ~ (p=0.534 n=10)
stepfunction.json 237.5 ± 1% 237.0 ± 0% ~ (p=0.408 n=10)
geomean 367.0 367.6 +0.16%
¹ all samples are equal
/merge
Devflow running: /merge
View all feedbacks in Devflow UI.
2024-11-07 20:57:42 UTC :information_source: MergeQueue: pull request added to the queue
The median merge time in main is 24m.