dd-trace-py icon indicating copy to clipboard operation
dd-trace-py copied to clipboard

feat(asm): redact sensitive querystrings in http.url

Open gnufede opened this issue 3 years ago • 1 comments

Description

By default query strings are not sent, unless trace_query_string is enabled. In case trace_query_string is enabled, sensitive query strings will be redacted to be sent in http.url tag.

Checklist

Motivation

ASM will enable security analysis for the query strings in the backend, so query strings will be sent by default once sensitive query strings are redacted.

Design

This PR enables sensitive query strings redaction, next step will be enabling trace_query_string by default.

Testing strategy

  • A test added checking with an example of a sensitive query string.
  • System tests for sensitive query string obfuscation.
  • Performance test to evaluate re.sub overhead.

Reviewer Checklist

  • [ ] Title is accurate.
  • [ ] Description motivates each change.
  • [ ] No unnecessary changes were introduced in this PR.
  • [ ] PR cannot be broken up into smaller PRs.
  • [ ] Avoid breaking API changes unless absolutely necessary.
  • [ ] Tests provided or description of manual testing performed is included in the code or PR.
  • [ ] Release note has been added for fixes and features, or else changelog/no-changelog label added.
  • [ ] All relevant GitHub issues are correctly linked.
  • [ ] Backports are identified and tagged with Mergifyio.
  • [ ] Add to milestone.

gnufede avatar Jul 21 '22 15:07 gnufede

Codecov Report

Merging #3980 (66d82f4) into 1.x (a2c53ee) will decrease coverage by 0.38%. The diff coverage is 45.75%.

@@            Coverage Diff             @@
##              1.x    #3980      +/-   ##
==========================================
- Coverage   78.58%   78.19%   -0.39%     
==========================================
  Files         724      730       +6     
  Lines       58358    58801     +443     
==========================================
+ Hits        45860    45980     +120     
- Misses      12498    12821     +323     
Impacted Files Coverage Δ
ddtrace/contrib/aiohttp/patch.py 93.33% <ø> (ø)
tests/commands/test_runner.py 0.00% <ø> (ø)
tests/contrib/flask/test_flask_appsec.py 0.00% <0.00%> (ø)
tests/debugging/exploration/_coverage.py 0.00% <0.00%> (ø)
tests/debugging/exploration/_profiler.py 0.00% <0.00%> (ø)
tests/debugging/exploration/debugger.py 0.00% <0.00%> (ø)
tests/debugging/exploration/preload.py 0.00% <0.00%> (ø)
tests/debugging/exploration/sitecustomize.py 0.00% <0.00%> (ø)
tests/integration/test_integration.py 0.00% <0.00%> (ø)
tests/profiling/_test_multiprocessing.py 0.00% <0.00%> (ø)
... and 23 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar Aug 03 '22 16:08 codecov-commenter

@gnufede Here are the results from the microbenchmark for set_htto_meta on the baseline (1.x) compared to this PR (b73a51). I will get you macrobenchmark numbers by end of day today.

Benchmark baseline comparison
sethttpmeta-useragentvariant_exists_2 40.0 us 40.4 us: 1.01x slower
sethttpmeta-useragentvariant_not_exists_1 35.3 us 35.1 us: 1.00x faster
sethttpmeta-obfuscation-worst-case-implicit-query 34.8 us 110 us: 3.18x slower
sethttpmeta-obfuscation-worst-case-explicit-query 34.8 us 107 us: 3.09x slower
sethttpmeta-obfuscation-regular-case-implicit-query 34.6 us 62.8 us: 1.82x slower
sethttpmeta-obfuscation-regular-case-explicit-query 34.5 us 62.5 us: 1.81x slower
sethttpmeta-obfuscation-no-query 34.6 us 40.5 us: 1.17x slower
sethttpmeta-obfuscation-send-querystring-disabled 34.9 us 34.7 us: 1.01x faster
sethttpmeta-obfuscation-disabled 34.6 us 35.0 us: 1.01x slower
Geometric mean (ref) 1.30x slower

majorgreys avatar Aug 22 '22 14:08 majorgreys