dd-trace-js

add support to api security sampling

Open IlyasShabi opened this issue 1 year ago • 2 comments

What does this PR do?

  • Implements a new API security sampling algorithm using an LRU cache with a 30s TTL
  • Delays schema extraction until the end of the request processing

Motivation

The current API security sampling method randomly selects 10% of requests. This PR introduces a new algorithm based on request priority, using an LRU cache with a TTL keyed on each { url, statusCode, method } combination.

Delaying the schema extraction to the end of the request ensures that we have an accurate priority value.
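
A minimal sketch of the idea described above, not the code in this PR: the class name, the `keep` flag (assumed to be a boolean derived from the request priority), and the `maxEntries` default are all hypothetical, and a plain `Map` stands in for the cache library.

```javascript
// Illustrative sketch only: names, parameters, and defaults are hypothetical.
class ApiSecuritySampler {
  constructor (delaySeconds = 30, maxEntries = 4096, now = () => Date.now()) {
    this.ttlMs = delaySeconds * 1000
    this.maxEntries = maxEntries
    this.now = now
    // Map preserves insertion order, so the first key is the least recently used.
    this.expiries = new Map()
  }

  // Call at the end of the request, once statusCode and the final priority are known.
  // `keep` is an assumed boolean derived from that priority.
  sample ({ method, url, statusCode }, keep) {
    if (!keep) return false

    const key = `${method} ${url} ${statusCode}`
    const t = this.now()
    const expiry = this.expiries.get(key)

    if (expiry !== undefined && expiry > t) {
      // Seen within the TTL: refresh recency, keep the original expiry, skip extraction.
      this.expiries.delete(key)
      this.expiries.set(key, expiry)
      return false
    }

    // First sighting or expired entry: record it and evict the LRU key if over capacity.
    this.expiries.delete(key)
    this.expiries.set(key, t + this.ttlMs)
    if (this.expiries.size > this.maxEntries) {
      this.expiries.delete(this.expiries.keys().next().value)
    }
    return true
  }
}
```

In this sketch, a delay of 0 makes every entry expire immediately, so every kept request is sampled.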

Plugin Checklist

IlyasShabi avatar Oct 03 '24 07:10 IlyasShabi

Overall package size

Self size: 7.97 MB
Deduped: 65.01 MB
No deduping: 65.35 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.2.1 | 19.18 MB | 19.19 MB |
| @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB |
| @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.65 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

github-actions[bot] avatar Oct 03 '24 07:10 github-actions[bot]

Benchmarks

Benchmark execution time: 2024-11-13 08:30:06

Comparing candidate commit 37d6aab91ff7ec49c3dcd27f09beaed3103ffc7d in PR branch api-security-sampling with baseline commit 1ee800011164cec06c368f1d3692671362d46468 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.

pr-commenter[bot] avatar Oct 03 '24 12:10 pr-commenter[bot]

Codecov Report

Attention: Patch coverage is 84.37500% with 5 lines in your changes missing coverage. Please review.

Project coverage is 48.27%. Comparing base (564795f) to head (8303d2f). Report is 23 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| ...ckages/dd-trace/src/appsec/api_security_sampler.js | 86.20% | 4 Missing :warning: |
| packages/dd-trace/src/appsec/index.js | 66.66% | 1 Missing :warning: |

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #4755       +/-   ##
===========================================
- Coverage   79.17%   48.27%   -30.91%     
===========================================
  Files         273      107      -166     
  Lines       12427     3383     -9044     
===========================================
- Hits         9839     1633     -8206     
+ Misses       2588     1750      -838     


codecov[bot] avatar Nov 05 '24 09:11 codecov[bot]

LGTM, let's wait for @simon-id's opinion

iunanua avatar Nov 05 '24 11:11 iunanua

Code LGTM now, but I haven't had time to fully review the tests. Still, I found some problems.

simon-id avatar Nov 07 '24 16:11 simon-id

You have some system test failures for the API Security scenario. Is it just the old tests that we should disable now?

simon-id avatar Nov 12 '24 13:11 simon-id

Code and tests LGTM, but let's resolve that CI failure

simon-id avatar Nov 12 '24 15:11 simon-id

In some scenarios, we need to set the sampling delay to 0 ("DD_API_SECURITY_SAMPLE_DELAY": "0.0"), which means we want to sample all requests. Unfortunately, the @isaacs/ttlcache library does not support ttl: 0, so we need to handle this case ourselves.

Here are some failing system test scenarios:

Solutions:

  • As implemented in the last commit, if the delay is 0, we instantiate a dummy TTL cache with basic functions to handle this edge case.
  • Completely remove the @isaacs/ttlcache library and replace it with a custom implementation, similar to the Python implementation.
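
To make the two options concrete, here is a rough sketch under stated assumptions. `NoopCache`, `SimpleTtlCache`, and `createCache` are hypothetical names, not what the last commit or the Python tracer actually uses, and the custom cache omits size limits for brevity.

```javascript
// Illustrative sketch only: all names here are hypothetical.

// Option 1: stand-in used when the delay is 0. It never remembers a key,
// so every request looks unseen and gets sampled.
class NoopCache {
  has () { return false }
  set () {}
  get size () { return 0 }
}

// Option 2: minimal custom TTL cache with no external dependency.
class SimpleTtlCache {
  constructor (ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
    this.expiries = new Map() // key -> expiry timestamp in ms
  }

  has (key) {
    const expiry = this.expiries.get(key)
    if (expiry === undefined) return false
    if (expiry <= this.now()) {
      this.expiries.delete(key) // lazily drop expired entries
      return false
    }
    return true
  }

  set (key) { this.expiries.set(key, this.now() + this.ttlMs) }

  get size () { return this.expiries.size }
}

// Picks the cache from the configured delay, e.g. the string "0.0" from the environment.
function createCache (delaySeconds, now) {
  const ttlMs = Number(delaySeconds) * 1000
  return ttlMs > 0 ? new SimpleTtlCache(ttlMs, now) : new NoopCache()
}
```

With a custom cache, the zero-delay case could also be handled inside `has` directly, which would make the separate stand-in unnecessary.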

WDYT? @iunanua @simon-id

IlyasShabi avatar Nov 13 '24 08:11 IlyasShabi