Integrate DHCP lease statistics
The CNaaS team wants to be able to integrate DHCP statistics into NAV.
Overview
A DHCP server can be made to summarize statistics on its networks and address ranges, such as the current number of leases vs. the maximum number for each range. A third-party script could gather these metrics and send them to NAV's Graphite server. However, NAV has no way to interpret or graph these metrics, since they didn't come from NAV. At best, you can add threshold rules on these "foreign" metrics.
We have identified three goals for a minimally viable feature:
- [ ] #2372
- [ ] #2373
- [x] #2371
- [x] #2424
- [ ] #2931
Examples
For ISC DHCP, a command line utility exists to summarize information about each configured DHCP pool: `dhcpd-pools`. The command can output either human-readable tables to stdout or JSON data, which is excellent for a script to parse and push to Graphite.
Using the output of a dhcpd-pools command as an example (IP-ranges have been anonymized):
Ranges:
shared net name     first ip  last ip   max   cur  percent  touch   t+c  t+c perc
vlan511             w.x.y.z - w.x.y.z   239    39   16.318      0    39    16.318
vlan1120            w.x.y.z - w.x.y.z   239    76   31.799    163   239   100.000
vlan1121            w.x.y.z - w.x.y.z   239     0    0.000      8     8     3.347
vlan1100            w.x.y.z - w.x.y.z   239    77   32.218    144   221    92.469
vlan1100            w.x.y.z - w.x.y.z   254   117   46.063    137   254   100.000
vlan1100            w.x.y.z - w.x.y.z   254    76   29.921    177   253    99.606
vlan1100            w.x.y.z - w.x.y.z   254   119   46.850    133   252    99.213
vlan1160            w.x.y.z - w.x.y.z    14     0    0.000      0     0     0.000
vlan1170            w.x.y.z - w.x.y.z    27    26   96.296      0    26    96.296

Shared networks:
name                   max   cur  percent  touch   t+c  t+c perc
vlan511                239    39   16.318      0    39    16.318
vlan1100              1001   389   38.861    591   980    97.902
vlan1120               239    76   31.799    163   239   100.000
vlan1121               239     0    0.000      8     8     3.347
vlan1160                14     0    0.000      0     0     0.000
vlan1170                27    26   96.296      0    26    96.296

Sum of all ranges:
name                   max   cur  percent  touch   t+c  t+c perc
All networks          1759   530   30.131    762  1292    73.451
What we want to submit to Graphite are the `max`, `cur` and `touch` numbers for each network listed under Shared networks. The networks/pools are named after the VLAN they belong to (which is a matter of policy, not a requirement).
For this example, we might want to submit metrics like:
- `nav.dhcp.vlan511.max 239`
- `nav.dhcp.vlan511.cur 39`
- `nav.dhcp.vlan511.touch 0`
- `nav.dhcp.vlan1100.max 1001`
- `nav.dhcp.vlan1100.cur 389`
- `nav.dhcp.vlan1100.touch 591`
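As a rough illustration of the metric lines listed above, the following sketch formats per-network numbers for Graphite's plaintext protocol, which expects `<path> <value> <timestamp>` on each line (the timestamp is left out of the list above). The function name `graphite_lines` and the input dictionary layout are assumptions made for this example; they are not part of NAV or `dhcpd-pools`.

```python
import time


def graphite_lines(networks, prefix="nav.dhcp", timestamp=None):
    """Format shared-network stats as Graphite plaintext protocol lines.

    `networks` maps a pool name to a dict with "max", "cur" and "touch"
    values. This layout is an assumption for the sketch, not the native
    dhcpd-pools output format.
    """
    if timestamp is None:
        timestamp = int(time.time())
    lines = []
    for name, stats in networks.items():
        for key in ("max", "cur", "touch"):
            lines.append(f"{prefix}.{name}.{key} {stats[key]} {timestamp}")
    return lines


# Values copied from the "Shared networks" table in this issue
networks = {
    "vlan511": {"max": 239, "cur": 39, "touch": 0},
    "vlan1100": {"max": 1001, "cur": 389, "touch": 591},
}

for line in graphite_lines(networks, timestamp=1700000000):
    print(line)
```

An integration script could join these lines with newlines and write them to Carbon's plaintext listener (TCP port 2003 by default).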
The actual IP ranges are of less importance in an MVP: as long as NAV can parse a VLAN name from the level below `nav.dhcp`, it can create DHCP utilization graphs in the VLAN details page. When viewing the VLAN details for VLAN 1100, NAV could find that there are DHCP metrics that match this VLAN in `nav.dhcp`, and draw a graph from that.
The network names can also be something like `vlan1100_some_description` or `some_description_vlan1100`, but this should still match as VLAN 1100 in NAV.
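One possible way to do this matching is a regular expression that looks for `vlan` followed by digits anywhere in the pool name. This is only a sketch: the helper name `parse_vlan` and the exact pattern are assumptions, and the real naming rules may turn out to need something stricter.

```python
import re

# "vlan" followed by digits, not preceded by another letter or digit.
# An underscore is allowed as a separator on either side.
VLAN_PATTERN = re.compile(r"(?<![A-Za-z0-9])vlan(\d+)", re.IGNORECASE)


def parse_vlan(pool_name):
    """Return the VLAN number found in a pool name, or None if there is none."""
    match = VLAN_PATTERN.search(pool_name)
    return int(match.group(1)) if match else None


for name in ("vlan1100", "vlan1100_some_description", "some_description_vlan1100"):
    print(name, parse_vlan(name))
```

All three names above resolve to VLAN 1100, while a name without a `vlan<number>` part yields `None`.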
An extra level in the metric path for location may also be needed. This could in reality be any prefix configured into the integration script, something like:
- `nav.dhcp.trondheim.vlan511`
- `nav.dhcp.oslo.vlan511`
NAV separates broadcast domains that share a common VLAN tag by using a `netident` attribute (which it parses from the router port description). The most feasible way for NAV to separate DHCP pools into the correct broadcast domains is if the netident NAV knows is part of the DHCP pool name and subsequently encoded into the Graphite metric path.
E.g. `nav.dhcp.vlan511.somenetident` to group multiple pools by VLAN number, or `nav.dhcp.somenetident.vlan511` to group by netident.
Another question is how to handle the situation where there is no netident in a DHCP pool name, just the VLAN tag: what metric path should be used then?
IRL discussion concluded that `nav.dhcp.vlanXXX.netidentYYY` is the preferred prefix for DHCP pool stats. We should probably also support the simple case where no VLAN tags are reused, so DHCP pool names of just `vlanXXX` should be logged directly under `nav.dhcp.vlanXXX`.
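The agreed layout, including the fallback for pools without a netident, could be sketched like this. The function name `dhcp_metric_prefix` is hypothetical, and the sketch assumes the netident has already been made safe for use as a single Graphite path element.

```python
def dhcp_metric_prefix(vlan, netident=None, root="nav.dhcp"):
    """Build the metric prefix for a DHCP pool.

    With a netident:    nav.dhcp.vlanXXX.netidentYYY
    Without a netident: nav.dhcp.vlanXXX

    Assumes `netident` contains no dots or other characters that are
    special in Graphite paths.
    """
    parts = [root, f"vlan{vlan}"]
    if netident:
        parts.append(netident)
    return ".".join(parts)


print(dhcp_metric_prefix(511, "somenetident"))
print(dhcp_metric_prefix(511))
```

The individual stats would then be appended as the last level, e.g. `.max`, `.cur` and `.touch`.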
Actions:
- [ ] Verify which characters are legal in shared network names in an ISC-DHCPD config file (e.g. how do we separate the VLAN number from the netident part of the name? Using `.` would be preferable, since it matches nicely with Graphite metric levels.) @knutvi
- [x] Verify that comma `,` is a valid part of a Graphite path name (since many netidents as parsed by NAV from the NTNU convention will contain one or two commas) @lunkwill42
Commas do not seem to be valid, or at least they will interfere with the interpretation of Graphite render commands if used in metric names. We will need to escape these commas somehow (the standard for most special characters so far has been to either strip them or replace them with underscores).
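A minimal sketch of the underscore approach, assuming a conservative whitelist of characters that are kept as-is. The helper name `escape_path_element` and the whitelist are assumptions for illustration, and the example netident is made up.

```python
import re


def escape_path_element(name):
    """Replace anything except ASCII letters, digits, '_' and '-' with '_'.

    Dots are replaced as well, since a dot inside a netident would
    otherwise introduce an extra level in the Graphite path.
    """
    return re.sub(r"[^A-Za-z0-9_-]", "_", name)


# Hypothetical netident containing two commas
print(escape_path_element("some,net,ident"))
```

Note that this mapping is lossy: two netidents that differ only in their special characters would end up with the same metric path.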