dd-trace-py icon indicating copy to clipboard operation
dd-trace-py copied to clipboard

chore(telemetry): use enum for telemetry log levels

Open romainkomorn-exdatadog opened this issue 1 year ago • 4 comments
trafficstars

Makes type-checking more accurate the telemetry writer's add_log() method by specifying an enum.

For example, improperly specifying TELEMETRY_LOG_LEVEL.POTATO instead of .ERROR would result in:

ddtrace/appsec/_iast/_metrics.py:77: error: "Type[TELEMETRY_LOG_LEVEL]" has no attribute "POTATO"  [attr-defined]

This was prompted by #9349 having failures in system-tests due to the use of WARNING instead of WARN.

The telemetry API docs currently specify:

Log level Remote collection policy
ERROR Enabled by default, configured by environment variable DD_TELEMETRY_LOG_COLLECTION_ENABLED
WARN Enabled by default, configured by same environment variable as ERROR
DEBUG Can be enabled locally by the end user when asked by the support team.

Checklist

  • [x] Change(s) are motivated and described in the PR description
  • [x] Testing strategy is described if automated tests are not included in the PR
  • [x] Risks are described (performance impact, potential for breakage, maintainability)
  • [x] Change is maintainable (easy to change, telemetry, documentation)
  • [x] Library release note guidelines are followed or label changelog/no-changelog is set
  • [x] Documentation is included (in-code, generated user docs, public corp docs)
  • [x] Backport labels are set (if applicable)
  • [x] If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

  • [x] Title is accurate
  • [x] All changes are related to the pull request's stated goal
  • [x] Description motivates each change
  • [x] Avoids breaking API changes
  • [x] Testing strategy adequately addresses listed risks
  • [x] Change is maintainable (easy to change, telemetry, documentation)
  • [x] Release note makes sense to a user of the library
  • [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • [x] Backport labels are set in a manner that is consistent with the release branch maintenance policy

romainkomorn-exdatadog avatar Jun 12 '24 10:06 romainkomorn-exdatadog

Datadog Report

Branch report: romain.komorn/chore/enforce_log_telemetry_levels Commit report: a8197a7 Test service: dd-trace-py

:x: 6 Failed (6 Known Flaky), 114875 Passed, 800 Skipped, 3h 16m 18.45s Total Time

:x: Failed Tests (6)

This report shows up to 5 failed tests.

  • test_flask_packages_not_patched[pyarrow] - test_packages.py - :snowflake: Known flaky - Details

    Expand for error
    ssert False
    +  where False = <built-in method startswith of str object at 0x7f22418da920>("Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}")
    +    where <built-in method startswith of str object at 0x7f22418da920> = 'Error: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject'.startswith
    +    and   "Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}" = pyarrow==16.1.0: .expected_result1
    
  • test_flask_packages_not_patched[pyarrow] - test_packages.py - :snowflake: Known flaky - Details

    Expand for error
    ssert False
    +  where False = <built-in method startswith of str object at 0x7fb91ce9cb20>("Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}")
    +    where <built-in method startswith of str object at 0x7fb91ce9cb20> = "Error: No module named 'pandas'".startswith
    +    and   "Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}" = pyarrow==16.1.0: .expected_result1
    
  • test_flask_packages_not_patched[pyarrow] - test_packages.py - :snowflake: Known flaky - Details

    Expand for error
    ssert False
    +  where False = <built-in method startswith of str object at 0x7fea6e982880>("Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}")
    +    where <built-in method startswith of str object at 0x7fea6e982880> = "Error: No module named 'pandas'".startswith
    +    and   "Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}" = pyarrow==16.1.0: .expected_result1
    
  • test_flask_packages_patched[pyarrow] - test_packages.py - :snowflake: Known flaky - Details

    Expand for error
    ssert False
    +  where False = <built-in method startswith of str object at 0x7fea6eaa2ba0>("Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}")
    +    where <built-in method startswith of str object at 0x7fea6eaa2ba0> = "Error: No module named 'pandas'".startswith
    +    and   "Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}" = pyarrow==16.1.0: .expected_result1
    
  • test_flask_packages_patched[pyarrow] - test_packages.py - :snowflake: Known flaky - Details

    Expand for error
    ssert False
    +  where False = <built-in method startswith of str object at 0x7f223bc9d240>("Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}")
    +    where <built-in method startswith of str object at 0x7f223bc9d240> = 'Error: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject'.startswith
    +    and   "Table data: {'column1': {0: 'some-value'}, 'column2': {0: 1}}" = pyarrow==16.1.0: .expected_result1
    

Benchmarks

Benchmark execution time: 2024-07-17 16:36:02

Comparing candidate commit 91bb4c069cdea4430efe220892ca29b4da4751fb in PR branch romain.komorn/chore/enforce_log_telemetry_levels with baseline commit 6a280ee93c8c358458be8ac225256f6f692b2479 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 214 metrics, 2 unstable metrics.

pr-commenter[bot] avatar Jun 12 '24 11:06 pr-commenter[bot]

Codecov Report

Attention: Patch coverage is 13.04348% with 20 lines in your changes missing coverage. Please review.

Project coverage is 10.52%. Comparing base (c728c68) to head (ff848a7). Report is 17 commits behind head on main.

Files Patch % Lines
tests/telemetry/test_telemetry_metrics.py 0.00% 10 Missing :warning:
ddtrace/internal/telemetry/constants.py 0.00% 5 Missing :warning:
ddtrace/_trace/processor/__init__.py 0.00% 3 Missing :warning:
ddtrace/appsec/_iast/_metrics.py 50.00% 1 Missing :warning:
ddtrace/internal/telemetry/writer.py 0.00% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #9520       +/-   ##
===========================================
- Coverage   74.30%   10.52%   -63.79%     
===========================================
  Files        1398     1366       -32     
  Lines      129930   127754     -2176     
===========================================
- Hits        96541    13441    -83100     
- Misses      33389   114313    +80924     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jun 12 '24 11:06 codecov-commenter

This and #9349 are going to conflict with one another depending who merges first (#9349 will have to get updated since it uses add_log() as well).

romainkomorn-exdatadog avatar Jun 17 '24 14:06 romainkomorn-exdatadog

Datadog Report

Branch report: romain.komorn/chore/enforce_log_telemetry_levels Commit report: da64aa7 Test service: dd-trace-py

:white_check_mark: 0 Failed, 171858 Passed, 1827 Skipped, 11h 7m 13.76s Total duration (5h 54m 17.21s time saved) :snowflake: 1 New Flaky

New Flaky Tests (1)

  • test_otel_trace_across_fork - test_context.py - Last Failure

    Expand for error
    t request <Request GET /test/session/snapshot >:
      At snapshot (token='tests.opentelemetry.test_context.test_otel_trace_across_fork'):
       - Directory: /snapshots
       - CI mode: 0
       - Trace File: /snapshots/tests.opentelemetry.test_context.test_otel_trace_across_fork.json
       - Stats File: /snapshots/tests.opentelemetry.test_context.test_otel_trace_across_fork_tracestats.json
       At compare of 1 expected trace(s) to 1 received trace(s):
        At trace 'internal' (2 spans):
    eceived fewer spans (1) than expected (2). Expected unmatched spans: 'internal'
    

Datadog Report

Branch report: romain.komorn/chore/enforce_log_telemetry_levels Commit report: ed1d6dd Test service: dd-trace-py

:white_check_mark: 0 Failed, 173970 Passed, 1873 Skipped, 7h 57m 43.09s Total duration (54m 41.34s time saved) :hourglass: 1 Performance Regression

:hourglass: Performance Regressions vs Default Branch (1)

  • test_data_streams_payload_size[key_and_length0-payload_and_length0] - test_kafka.py 1m 30.11s (+1m 26.62s, +2485%) - Details

CODEOWNERS have been resolved as:

ddtrace/_trace/processor/__init__.py                                    @DataDog/apm-core-python
ddtrace/appsec/_iast/_metrics.py                                        @DataDog/asm-python
ddtrace/appsec/_metrics.py                                              @DataDog/asm-python
ddtrace/internal/telemetry/constants.py                                 @DataDog/apm-core-python
ddtrace/internal/telemetry/writer.py                                    @DataDog/apm-core-python
tests/telemetry/test_telemetry_metrics.py                               @DataDog/apm-core-python

github-actions[bot] avatar Jul 16 '24 17:07 github-actions[bot]