vmware_exporter

[Bug]: Sensor metrics can have identical names

Open Mikle-Bond opened this issue 1 year ago • 0 comments

Symptoms

Prometheus logs a warning about clashing values on every scrape. Example:

{"caller":"scrape.go:1744","component":"scrape manager","level":"warn","msg":"Error on ingesting samples with different value but same timestamp","num_dropped":2,"scrape_pool":"vmware","target":"http://vmware-exporter:9272/metrics?target=10.10.10.1","ts":"2024-10-03T07:30:54.384Z"}

Culprits

Looking at the exporter's output, there are metrics with the same name and labels repeated several times. Example (emphasis mine; repeated lines are marked with `*`):

[...snip...]
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="igb driver 5.0.5.1",type="Software Components"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="igb driver 5.0.5.1",type="Software Components"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 1: Running/Full Power-Enabled",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 2: Running/Full Power-Enabled",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: General Chassis intrusion - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Drive Bay intrusion - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: I/O Card area intrusion - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Processor area intrusion - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: System unplugged from LAN - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unauthorized dock - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: FAN area intrusion - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
* vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="System Chassis 1 Chassis Intru: Unknown - Deassert",type="Chassis"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 2 PS2 Status: Failure status - Deassert",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 2 PS2 Status: Predictive failure - Deassert",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 2 PS2 Status: Power Supply AC lost - Deassert",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 1 PS1 Status: Failure status - Deassert",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 1 PS1 Status: Predictive failure - Deassert",type="power"} 2.0
  vmware_host_sensor_state{cluster_name="",dc_name="ha-datacenter",host_name="node-207.work.internal",name="Power Supply 1 PS1 Status: Power Supply AC lost - Deassert",type="power"} 2.0
[...snip...]
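To check a scrape for this kind of repetition, something like the sketch below can count how often each name/label combination occurs in the exporter's text output. It is only an illustration: `find_duplicate_series` is a hypothetical helper, not part of vmware_exporter, and it assumes each sample line ends with a value and no trailing timestamp, as in the output above.

```python
from collections import Counter


def find_duplicate_series(exposition_text: str) -> dict[str, int]:
    """Return every series (metric name + labels) that appears more than once.

    Assumption: each sample line is "<name>{<labels>} <value>" with no
    trailing timestamp, which matches the exporter output quoted above.
    """
    counts: Counter[str] = Counter()
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Label values may contain spaces, so split on the LAST space only:
        # everything before it identifies the series, the rest is the value.
        series, _, _value = line.rpartition(" ")
        counts[series] += 1
    return {series: n for series, n in counts.items() if n > 1}


if __name__ == "__main__":
    sample = (
        'vmware_host_sensor_state{name="igb driver 5.0.5.1",type="Software Components"} 2.0\n'
        'vmware_host_sensor_state{name="igb driver 5.0.5.1",type="Software Components"} 2.0\n'
        'vmware_host_sensor_state{name="Power Supply 1",type="power"} 2.0\n'
    )
    for series, n in find_duplicate_series(sample).items():
        print(f"{n}x {series}")
```

Feeding it the body of the `/metrics` response for the affected target should list the same series that are starred above.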

Fetched via pyVim, this is what the sensor descriptions look like:

   (vim.host.NumericSensorInfo) {
      dynamicType = <unset>,
      dynamicProperty = (vmodl.DynamicProperty) [],
      name = 'System Chassis 1 Chassis Intru: Unknown - Deassert',
      healthState = (vim.ElementDescription) {
         dynamicType = <unset>,
         dynamicProperty = (vmodl.DynamicProperty) [],
         label = 'Green',
         summary = 'Sensor is operating under normal conditions',
         key = 'green'
      },
      currentReading = 0L,
      unitModifier = 0,
      baseUnits = '',
      rateUnits = <unset>,
      sensorType = 'Chassis'
   },
[...repeats 7 more times...]
   (vim.host.NumericSensorInfo) {
      dynamicType = <unset>,
      dynamicProperty = (vmodl.DynamicProperty) [],
      name = 'igb driver 5.0.5.1',
      healthState = (vim.ElementDescription) {
         dynamicType = <unset>,
         dynamicProperty = (vmodl.DynamicProperty) [],
         label = 'Green',
         summary = 'Sensor is operating under normal conditions',
         key = 'green'
      },
      currentReading = 0L,
      unitModifier = 0,
      baseUnits = '',
      rateUnits = <unset>,
      sensorType = 'Software Components'
   },
[...repeats 1 more time...]
   (vim.host.NumericSensorInfo) {
      dynamicType = <unset>,
      dynamicProperty = (vmodl.DynamicProperty) [],
      name = 'igb device firmware 1.2.3',
      healthState = (vim.ElementDescription) {
         dynamicType = <unset>,
         dynamicProperty = (vmodl.DynamicProperty) [],
         label = 'Green',
         summary = 'Sensor is operating under normal conditions',
         key = 'green'
      },
      currentReading = 0L,
      unitModifier = 0,
      baseUnits = '',
      rateUnits = <unset>,
      sensorType = 'Software Components'
   },
[...repeats 1 more time...]
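Since the host reports several sensors with the same `name` and `sensorType`, those two fields alone cannot make a unique label set. One possible workaround, sketched below, is to add an occurrence counter as an extra label. This is not the exporter's actual code or an agreed fix: the `Sensor` dataclass is a stand-in for `vim.host.NumericSensorInfo`, the `index` label is hypothetical, and the sketch assumes the host returns sensors in a stable order between scrapes, which has not been verified.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sensor:
    """Hypothetical stand-in for the fields of vim.host.NumericSensorInfo used here."""
    name: str
    sensorType: str
    state: float


def unique_label_sets(sensors: list[Sensor]) -> list[tuple[dict[str, str], float]]:
    """Pair each sensor's state with a label set made unique by an 'index' label.

    The 'index' label (an assumption, not an existing exporter label) counts
    how many sensors with the same (name, sensorType) were seen earlier.
    """
    seen: defaultdict[tuple[str, str], int] = defaultdict(int)
    result = []
    for sensor in sensors:
        key = (sensor.name, sensor.sensorType)
        labels = {
            "name": sensor.name,
            "type": sensor.sensorType,
            "index": str(seen[key]),
        }
        seen[key] += 1
        result.append((labels, sensor.state))
    return result


if __name__ == "__main__":
    sensors = [
        Sensor("igb driver 5.0.5.1", "Software Components", 2.0),
        Sensor("igb driver 5.0.5.1", "Software Components", 2.0),
        Sensor("Power Supply 1: Running/Full Power-Enabled", "power", 2.0),
    ]
    for labels, state in unique_label_sets(sensors):
        print(labels, state)
```

An alternative would be to drop exact duplicates before exporting, but that would hide the case where same-named sensors report different states, which is what the Prometheus warning points at.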

Mikle-Bond, Oct 03 '24 08:10