collector
collector copied to clipboard
ROX-28557: Collector should send complete networking delta when the config is updated
Description
Currently if external IPs is disabled and then it is switched to being enabled, the unnormalized connection will be reported open at the next scrape interval, but the normalized connection will only be reported closed after the afterglow period is expired. A similar problem occurs if external IPs is enabled and then is switched to being disabled. In the above example we would like the normalized connection to be reported closed at the same time that the unnormalized connection is reported open.
To fix this, the old state is looped though after the delta is computed. If external IPs is initially disabled and then enabled all of the normalized connections will be removed from the old state and added to the delta in the closed state. If external IPs is initially enabled and then disabled, each of the connections will be checked if they are external IPs and removed from the old state and added to the delta in the closed state.
Alternatives
Setting the afterglow period to slightly less than the scrape interval when the state of the setting for external IPs is changed. This is done so that short lived connections that were opened and closed during the previous scrape interval will be reported as being open, and connections that went from being normalized to unnormalized from one scrape to the next will have the normalized version reported as being closed. Connections that were closed during previous scrape intervals, but are not passed their afterglow period will be reported as being closed. It may be a bit arbitrary that short lived connections within the previous scrape interval will not be reported closed, but connections that were closed during previous scrape intervals will be reported closed.
Clearing the old state of the connections when the runtime config is changed. This would however, result in the normalized connection not being reported as closed as it is would not be in the old or new states. This would be okay if sensor knew that the messages sent constituted a state and not a delta.
Another alternative that was considered was turning off afterglow for scrapes in which the state of the external IPs feature had changed. This didn't work, because short lived connections would be reported as closed, which is not desired.
Checklist
- [ ] Investigated and inspected CI test results
- [ ] Updated documentation accordingly
Automated testing
- [x] Added unit tests
- [ ] Added integration tests
- [ ] Added regression tests
If any of these don't apply, please comment below.
Testing Performed
TODO(replace-me) Use this space to explain how you tested your PR, or, if you didn't test it, why you did not do so. (Valid reasons include "CI is sufficient" or "No testable changes") In addition to reviewing your code, reviewers must also review your testing instructions, and make sure they are sufficient.
For more details, ref the Confluence page about this section.
Codecov Report
Attention: Patch coverage is 44.44444% with 25 lines in your changes missing coverage. Please review.
Project coverage is 28.52%. Comparing base (
215e5d4) to head (5183703).
:white_check_mark: All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## master #2068 +/- ##
==========================================
+ Coverage 28.40% 28.52% +0.11%
==========================================
Files 94 94
Lines 5717 5757 +40
Branches 2531 2547 +16
==========================================
+ Hits 1624 1642 +18
- Misses 3383 3393 +10
- Partials 710 722 +12
| Flag | Coverage Δ | |
|---|---|---|
| collector-unit-tests | 28.52% <44.44%> (+0.11%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.