Add support for Juniper CHASSIS and SYSTEM alerts
Closes #2358
Codecov Report
Merging #2388 (204d406) into master (33b5913) will increase coverage by 0.08%. The diff coverage is 100.00%.
:exclamation: Current head 204d406 differs from pull request most recent head dcff55b. Consider uploading reports for the commit dcff55b to get more accurate results
@@ Coverage Diff @@
## master #2388 +/- ##
==========================================
+ Coverage 54.52% 54.60% +0.08%
==========================================
Files 558 560 +2
Lines 40644 40709 +65
==========================================
+ Hits 22160 22231 +71
+ Misses 18484 18478 -6
Impacted Files | Coverage Δ | |
---|---|---|
python/nav/ipdevpoll/plugins/juniperalarm.py | 100.00% <100.00%> (ø) | |
python/nav/mibs/juniper_alarm_mib.py | 100.00% <100.00%> (ø) | |
... and 2 files with indirect coverage changes
Test results
12 files 12 suites 11m 21s :stopwatch: 3 256 tests 3 160 :heavy_check_mark: 96 :zzz: 0 :x: 9 243 runs 8 955 :heavy_check_mark: 288 :zzz: 0 :x:
Results for commit dcff55b5.
:recycle: This comment has been updated with latest results.
Currently, if a netbox has a non-zero count of red or yellow alarms, a start-event is sent; if the count is zero, an end-event is sent. There is no check for whether a state is already open (there should be), and no check for whether the specific netbox supports the MIB in question.
Also, tests needed.
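For reference, the current behaviour described above can be sketched roughly as follows. This is a standalone illustration, not the plugin code from this PR: the function name `events_for_counts` and the alert type strings are hypothetical, and no NAV APIs are used.

```python
from typing import Iterator, Tuple

# Hypothetical alert type names, used for illustration only.
ALERT_TYPES = {"yellow": "yellow-alarm", "red": "red-alarm"}


def events_for_counts(
    yellow_count: int, red_count: int
) -> Iterator[Tuple[str, str, int]]:
    """Yield (state, alert_type, count) for each alarm colour.

    The state is "start" when the count is non-zero and "end" when it is
    zero. Nothing here checks whether a state is already open.
    """
    for colour, count in (("yellow", yellow_count), ("red", red_count)):
        state = "start" if count > 0 else "end"
        yield state, ALERT_TYPES[colour], count


for event in events_for_counts(yellow_count=0, red_count=2):
    print(event)
```

A function shaped like this is also easy to cover with unit tests, since it takes plain integers and has no side effects.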
Kudos, SonarCloud Quality Gate passed!
0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells
No Coverage information
0.0% Duplication
The actual count could possibly be stored together with the event, with the help of EventQueueVar. Any good examples of where this is done?
eventengine will by design ignore end-events that appear without a corresponding start-event having been posted first.
Cool, how convenient!
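To make that behaviour concrete, here is a toy model of stateful event handling. It is only a sketch of the semantics described in this thread (an end-event without an open state is ignored, and so is a duplicate start-event); the class `StateTracker` and its return strings are made up for illustration and are not eventengine code.

```python
from typing import Set, Tuple


class StateTracker:
    """Toy model of stateful event handling; not NAV's eventengine."""

    def __init__(self) -> None:
        # Each open state is identified by (netbox, alert_type).
        self.open_states: Set[Tuple[str, str]] = set()

    def handle(self, netbox: str, alert_type: str, state: str) -> str:
        key = (netbox, alert_type)
        if state == "start":
            if key in self.open_states:
                return "ignored duplicate start"
            self.open_states.add(key)
            return "opened"
        if state == "end":
            if key not in self.open_states:
                return "ignored end without start"
            self.open_states.remove(key)
            return "closed"
        raise ValueError(f"unknown state: {state!r}")


tracker = StateTracker()
print(tracker.handle("switch-1", "red-alarm", "end"))    # no open state yet
print(tracker.handle("switch-1", "red-alarm", "start"))  # opens the state
print(tracker.handle("switch-1", "red-alarm", "start"))  # duplicate
print(tracker.handle("switch-1", "red-alarm", "end"))    # closes the state
```

Under this model the plugin can post an end-event on every poll with a zero count without creating bogus state changes, at the cost of some log noise.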
Whether this example is good is debatable, but here is one instance of setting arbitrary event variables through the "varmap" (line 160 should be highlighted):
https://github.com/Uninett/nav/blob/e6634e512c8ecf283c85a701366620e724806ab7/python/nav/ipdevpoll/shadows/gwpeers.py#L147-L171
There are two issues that would make it difficult to come to an ideal solution:
- EventQueueVars aren't automatically carried over to the corresponding alerthist entries that eventengine generates (though I think they are copied into the alert queue - however, alert queue entries represent notifications and are removed once notifications are sent).
Usually, if you want to carry arbitrary variables over to the permanent record of alerthist/AlertHistory, you need to write an event handler plugin that does so explicitly. Currently, I think perhaps the only plugin that does so is the event plugin for maintenance events, which does it here:
https://github.com/Uninett/nav/blob/e6634e512c8ecf283c85a701366620e724806ab7/python/nav/eventengine/plugins/maintenancestate.py#L42
- Secondly, a state is a state in NAV: there isn't really a mechanism for adding more events or information to an existing alerthist state. So if the alert count changes over time (but remains non-zero), there isn't really an effective way to update an existing "juniper red alert non-zero" state. The new event will just go down as "oh, here's a duplicate start-event, I'll throw it away". You might, however, be able to add some magic by implementing an eventengine plugin for your new event type.
So, presently, you can generate an alert when the "red count" transitions from 0 to 1, and this alert can say "there's 1 red alert". However, when the counter subsequently transitions from 1 to 2, there is no way to notify the NAV user that "there are now 2 red alerts". Again, this analysis is from memory. Unless it is already possible, we could jig eventengine so that a custom plugin can override the handling of "duplicate" events.
The maintenanceState plugin already suggests that we could work around the "duplicate" handling:
https://github.com/Uninett/nav/blob/e6634e512c8ecf283c85a701366620e724806ab7/python/nav/eventengine/plugins/maintenancestate.py#L43-L46
This means that we could potentially detect a change in the red/yellow alert count, update the existing alert history state and send an extra notification. However, there is still no good way to maintain a history/log of the changing red/yellow alert count over time. Maybe storing a current value and a maximum value as alerthistvars? It might be time for a fuller design discussion with the CNaaS team who wanted this feature :)
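As a rough sketch of the "current value plus maximum value" idea: the snippet below models an open alert state as a plain dictionary of variables and decides whether an extra notification is due. Everything in it is an assumption made for illustration, including the names `OpenState`, `update_count`, `count` and `max_count`, and the choice to store values as strings; it does not use the real AlertHistory model or any eventengine API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class OpenState:
    """Stand-in for an open alert history state and its variables."""

    variables: Dict[str, str] = field(default_factory=dict)


def update_count(state: OpenState, new_count: int) -> bool:
    """Record the current and maximum alarm count on an open state.

    Returns True when the count changed, i.e. when an extra
    notification would be warranted, and False otherwise.
    """
    old = state.variables.get("count")
    if old is not None and int(old) == new_count:
        return False  # unchanged: treat as a plain duplicate
    previous_max = int(state.variables.get("max_count", "0"))
    state.variables["count"] = str(new_count)
    state.variables["max_count"] = str(max(previous_max, new_count))
    return True


state = OpenState()
for polled_count in (1, 1, 3, 2):
    changed = update_count(state, polled_count)
    print(polled_count, changed, state.variables)
```

This keeps only two values per state, so it still gives no full history of the count over time, which is the limitation noted above.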
So I did have a short discussion with @knutvi on this. I'm adding our conclusion to the original issue #2358.
I can confirm that every five minutes we get two logging messages about ignoring an end event for each netbox.