collector icon indicating copy to clipboard operation
collector copied to clipboard

ROX-28557: Collector should send complete networking delta when the config is updated

Open JoukoVirtanen opened this issue 8 months ago • 1 comments

Description

Currently if external IPs is disabled and then it is switched to being enabled, the unnormalized connection will be reported open at the next scrape interval, but the normalized connection will only be reported closed after the afterglow period is expired. A similar problem occurs if external IPs is enabled and then is switched to being disabled. In the above example we would like the normalized connection to be reported closed at the same time that the unnormalized connection is reported open.

To fix this, the old state is looped though after the delta is computed. If external IPs is initially disabled and then enabled all of the normalized connections will be removed from the old state and added to the delta in the closed state. If external IPs is initially enabled and then disabled, each of the connections will be checked if they are external IPs and removed from the old state and added to the delta in the closed state.

Alternatives

Setting the afterglow period to slightly less than the scrape interval when the state of the setting for external IPs is changed. This is done so that short lived connections that were opened and closed during the previous scrape interval will be reported as being open, and connections that went from being normalized to unnormalized from one scrape to the next will have the normalized version reported as being closed. Connections that were closed during previous scrape intervals, but are not passed their afterglow period will be reported as being closed. It may be a bit arbitrary that short lived connections within the previous scrape interval will not be reported closed, but connections that were closed during previous scrape intervals will be reported closed.

Clearing the old state of the connections when the runtime config is changed. This would however, result in the normalized connection not being reported as closed as it is would not be in the old or new states. This would be okay if sensor knew that the messages sent constituted a state and not a delta.

Another alternative that was considered was turning off afterglow for scrapes in which the state of the external IPs feature had changed. This didn't work, because short lived connections would be reported as closed, which is not desired.

Checklist

  • [ ] Investigated and inspected CI test results
  • [ ] Updated documentation accordingly

Automated testing

  • [x] Added unit tests
  • [ ] Added integration tests
  • [ ] Added regression tests

If any of these don't apply, please comment below.

Testing Performed

TODO(replace-me) Use this space to explain how you tested your PR, or, if you didn't test it, why you did not do so. (Valid reasons include "CI is sufficient" or "No testable changes") In addition to reviewing your code, reviewers must also review your testing instructions, and make sure they are sufficient.

For more details, ref the Confluence page about this section.

JoukoVirtanen avatar Mar 20 '25 23:03 JoukoVirtanen

Codecov Report

Attention: Patch coverage is 44.44444% with 25 lines in your changes missing coverage. Please review.

Project coverage is 28.52%. Comparing base (215e5d4) to head (5183703).

:white_check_mark: All tests successful. No failed tests found.

Files with missing lines Patch % Lines
collector/lib/ConnTracker.cpp 48.38% 5 Missing and 11 partials :warning:
collector/lib/NetworkStatusNotifier.cpp 16.66% 3 Missing and 2 partials :warning:
collector/lib/NetworkConnection.h 50.00% 2 Missing and 2 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2068      +/-   ##
==========================================
+ Coverage   28.40%   28.52%   +0.11%     
==========================================
  Files          94       94              
  Lines        5717     5757      +40     
  Branches     2531     2547      +16     
==========================================
+ Hits         1624     1642      +18     
- Misses       3383     3393      +10     
- Partials      710      722      +12     
Flag Coverage Δ
collector-unit-tests 28.52% <44.44%> (+0.11%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Mar 20 '25 23:03 codecov-commenter