dd-trace-js icon indicating copy to clipboard operation
dd-trace-js copied to clipboard

aws payload tagging

Open tlhunter opened this issue 1 year ago • 7 comments

This PR rebuilds #4131. It removes hundreds of files worth of whitespace changes and rebuilds yarn.lock based on current master branch. Ultimately @jbertran will have done 90% of the work in this PR.


What does this PR do?

  • This PR introduces AWS payload reporting as tags. The feature is entirely locked behind environment variables
  • I'd like to release this without documenting the feature / env vars, then have the Serverless team test it
  • Once they're happy with it and any data format changes are made then we'd document the variables

Configuration

We introduce 3 new environment variables:

  • DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING defines the activation of the feature for requests, values being either "all" (no additional redactionor a comma-separated list of JSONPath queries identifying payload paths to be replaced with the value"redacted"`.
  • DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING
  • DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH sets the depth after which we stop creating tags from a payload

Behaviour

With the feature activated, aws-sdk calls to the enabled plugins will create additional tags representing the payload, with the following modifications:

  1. Paths known to be PII/sensitive are hard-coded to be redacted (service by service)
  2. Paths known to be user-input data likely to contain JSON are expanded
  3. Paths matching the JSONPath queries passed by the environment variables or corresponding runtime tracer configuration are redacted

This PR only provides the feature for SNS as a first service, but the framework introduced here only requires slight adaptations of a given AWS service plugin to make it available, as well as the addition of the static PII fields configuration.

New dependencies

Adding jsonpath seems safe given the constraints it imposes on its scripts, even if I don't expect scripts to be used. Using rfdc is more questionable - we need a deep clone because JSONPath apply can only do side-effects, and we must not modify the payload, but maybe something simpler works.

Remaining work

In some cases, JSONPath filter expressions are not sufficient to do what we want.

For example, setting attributes for entities (like SNS topics) requires setting an AttributeName and an AttributeValue at top-level of the JSON payload. Ideally, we should be able to redact the AttributeValue only when the AttributeName matches a disallowed value (for example KMSMasterKeyId). JSONPath syntax does not allow such a complex query, so we need to also specify custom logic hooks that do not go through JSONPath to redact data.

Motivation

This come from:

  1. the desire to have real-world data correlated with traces
  2. the fact that AWS upstream API is well-defined and well-documented, helping us avoid PII/sensitive data pitfalls
  3. the existence of such a mechanism in datadog-lambda-js, but only scoped to lambda function input and output. This provides the same level of information, with additional redaction granularity, for AWS plugins.

Plugin Checklist

Additional Notes

Security

Datadog employees:

  • [ ] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • [x] This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

tlhunter avatar May 15 '24 18:05 tlhunter

Overall package size

Self size: 7.03 MB Deduped: 62.39 MB No deduping: 62.67 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/native-appsec | 8.1.1 | 18.67 MB | 18.68 MB | | @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB | | @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.4.1 | 2.14 MB | 2.23 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | jsonpath-plus | 9.0.0 | 580.4 kB | 1.03 MB | | import-in-the-middle | 1.8.1 | 71.67 kB | 785.15 kB | | msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | lru-cache | 7.14.0 | 74.95 kB | 74.95 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.3.1 | 25.21 kB | 25.21 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | path-to-regexp | 0.1.7 | 6.78 kB | 6.78 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

github-actions[bot] avatar May 15 '24 18:05 github-actions[bot]

Codecov Report

Attention: Patch coverage is 96.34146% with 3 lines in your changes missing coverage. Please review.

Project coverage is 86.52%. Comparing base (a34a685) to head (c335ac8). Report is 7 commits behind head on master.

Files Patch % Lines
packages/dd-trace/src/payload-tagging/index.js 87.50% 3 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4309      +/-   ##
==========================================
- Coverage   95.85%   86.52%   -9.33%     
==========================================
  Files         105      253     +148     
  Lines        3451    10925    +7474     
  Branches       33       33              
==========================================
+ Hits         3308     9453    +6145     
- Misses        143     1472    +1329     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar May 30 '24 19:05 codecov[bot]

It looks like the jsonpath module might not be very friendly with ESBuild:


  esbuild
cd /home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild
npm run build
Warning: G] "../include/module.js" should be marked as external for use with "require.resolve" [require-resolve-not-external]

    ../../node_modules/jsonpath/lib/grammar.js:102:58:
      102 │ ...clude = fs.readFileSync(require.resolve("../include/module.js"));
          ╵                                            ~~~~~~~~~~~~~~~~~~~~~~

Warning: G] "../include/action.js" should be marked as external for use with "require.resolve" [require-resolve-not-external]

    ../../node_modules/jsonpath/lib/grammar.js:103:58:
      103 │ ...clude = fs.readFileSync(require.resolve("../include/action.js"));
          ╵                                            ~~~~~~~~~~~~~~~~~~~~~~

Warning: G] "esprima" should be marked as external for use with "require.resolve" [require-resolve-not-external]

    ../../node_modules/jsonpath/lib/aesprim.js:4:27:
      4 │ var file = require.resolve('esprima');
        ╵                            ~~~~~~~~~

npm run built
node:internal/modules/cjs/loader:1189
  throw err;
  ^

Error: Cannot find module '../include/module.js'
Require stack:
- /home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1186:15)
    at Function.resolve (node:internal/modules/helpers:133:19)
    at ../../node_modules/jsonpath/lib/grammar.js (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:34936:55)
    at __require (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:12:51)
    at ../../node_modules/jsonpath/lib/parser.js (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:35589:19)
    at __require (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:12:51)
    at ../../node_modules/jsonpath/lib/index.js (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:42469:18)
    at __require (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:12:51)
    at ../../node_modules/jsonpath/index.js (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:42655:23)
    at __require (/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js:12:51) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '/home/runner/work/dd-trace-js/dd-trace-js/integration-tests/esbuild/out.js'
  ]
}

tlhunter avatar May 31 '24 15:05 tlhunter

The ~~forked~~ library jsonpath-plus has about 50% more downloads than jsonpath and fixes the issue so I'll switch to that: https://www.npmjs.com/package/jsonpath-plus

tlhunter avatar May 31 '24 15:05 tlhunter

Benchmarks

Benchmark execution time: 2024-08-30 22:02:55

Comparing candidate commit ea29c12598948ab25215ea3db40fce1b769d5bc4 in PR branch tlhunter/aws-payload-tagging with baseline commit 34a651effd9202d6ee1cffd2930886f615fe7956 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 258 metrics, 8 unstable metrics.

pr-commenter[bot] avatar May 31 '24 16:05 pr-commenter[bot]

It looks like things are finally starting to pass. Here's some of the SPAN logs in CI:

  meta: {
    '_dd.p.tid': '668ef02c00000000',
    '_dd.p.dm': '-1',
    service: 'test',
    env: 'tester',
    version: '9.2.2',
    'runtime-id': 'c18a5eec-e0c8-464d-87a0-5cd32ec1068c',
    'span.kind': 'client',
    'aws.operation': 'publish',
    'aws.region': 'us-east-1',
    region: 'us-east-1',
    aws_service: 'SNS',
    'aws.service': 'SNS',
    component: 'aws-sdk',
    'pathway.hash': '515580026984968434',
    'aws.request.body.TopicArn': 'arn:aws:sns:us-east-1:00000000000000000000:TestTopic',
    'aws.request.body.Message': 'message 1',
    'aws.request.body.MessageAttributes.baz.DataType': 'String',
    'aws.request.body.MessageAttributes.baz.StringValue': 'bar',
    'aws.request.body.MessageAttributes.keyOne.DataType': 'String',
    'aws.request.body.MessageAttributes.keyOne.StringValue': 'keyOne',
    'aws.request.body.MessageAttributes.keyTwo.DataType': 'String',
    'aws.request.body.MessageAttributes.keyTwo.StringValue': 'keyTwo',
    'aws.request.body.MessageAttributes._datadog.DataType': 'Binary',
    'aws.request.body.MessageAttributes._datadog.BinaryValue': '{"x-datadog-trace-id":"4453572687089088180","x-datadog-parent-id":"4453572687089088180","x-datadog-sampling-priority":"1","x-datadog-tags":"_dd.p.tid=668ef02c00000000,_dd.p.dm=-1","traceparent":"00-668ef02c000000003dce44c34b53b6b4-3dce44c34b53b6b4-01","tracestate":"dd=t.tid:668ef02c00000000;t.dm:-1;s:1;p:3dce44c34b53b6b4","dd-pathway-ctx-base64":"Bye1TYm5sPLOt9Hlk2TOt9Hlk2Q="}',
    'aws.response.request_id': '74f85857-1a7b-45d7-9499-42d33e0093b5',
    'aws.sns.topic_arn': 'arn:aws:sns:us-east-1:00000000000000000000:TestTopic',
    topicname: 'TestTopic',
    'aws.response.body.ResponseMetadata.RequestId': '74f85857-1a7b-45d7-9499-42d33e0093b5',
    'aws.response.body.MessageId': 'redacted',
    '_dd.base_service': 'test',
    language: 'javascript'
  },

tlhunter avatar Jul 11 '24 00:07 tlhunter

It looks like the only failing job at this point is that the tracer is now exceeding a size threshold set by the serverless team. I've pinged them for assistance.

tlhunter avatar Aug 28 '24 21:08 tlhunter