ipmi_exporter

More and more ipmi-sensors processes when multiple Prometheus servers scrape the same ipmi-exporter

Open chris-fengtian-guo opened this issue 3 years ago • 3 comments

I see this issue on some hardware, but not on all of it. On the affected machines ipmi-sensors runs slowly, and when multiple Prometheus servers scrape the same ipmi-exporter, more and more ipmi-sensors processes pile up and each one takes longer and longer to finish. With a Prometheus scrape_interval of 5 seconds, there were 44 ipmi-sensors processes just 10 minutes after ipmi-exporter started. I worked around the issue in the Prometheus configuration by increasing scrape_interval from 5 seconds to 10 seconds. That makes the problem less severe, but it does not fix it as long as multiple Prometheus servers scrape the same ipmi-exporter.

20 minutes later there were 134 ipmi-sensors processes:

[root@control01 fengtian]# date
Thu Dec 30 15:49:27 CST 2021
[root@control01 fengtian]# ps auxw|grep ipmi-sensors|wc -l
134

  1. I wrote the following draft patch to fix this issue. Sorry, it is not based on the latest master; it is based on this commit:

commit d83b49ac6cf6cd54e1f88b115ed5989f24896b30
Merge: 178b38f 5811015
Author: Conrad Hoffmann [email protected]
Date: Thu Jun 24 17:45:33 2021 +0200

    Merge pull request #76 from xibolun/master

    Update doc/configuration.md

1.2) This is my draft patch. It adds a cache for the output of the ipmi-sensors command; when more than one ipmi-sensors process is already running, the cached output is used instead of starting another one.

[fengtian@hyper1 ipmi_exporter_git]$ git show
commit b8d258a6376e3b1758a12f8551642ee501ec7fb8 (HEAD -> fguo_build)
Author: Guo fengtian [email protected]
Date: Thu Dec 30 14:36:21 2021 +0800

add cache for ipmi-sensors command output

On some hardware ipmi-sensors takes much longer than
1 second to run.
So when multiple Prometheus servers scrape the same ipmi exporter,
each ipmi-sensors run takes longer and longer, and more and
more ipmi-sensors processes run at the same time.
Sometimes a single ipmi-sensors command takes more than 20 minutes to
finish.

The solution is to cache the output of ipmi-sensors to avoid
more and more ipmi-sensors processes running at the same time.

Signed-off-by: Guo fengtian <[email protected]>

diff --git a/collector.go b/collector.go
index f8c777b..f852f04 100644
--- a/collector.go
+++ b/collector.go
@@ -1,7 +1,9 @@
 package main

 import (
+    "os/exec"
     "path"
+    "strings"
     "time"

     "github.com/prometheus/client_golang/prometheus"
@@ -49,6 +51,8 @@ var (
     )
 )

+var ipmi_cache freeipmi.Result
+
 // Describe implements Prometheus.Collector.
 func (c metaCollector) Describe(ch chan<- *prometheus.Desc) {
     // all metrics are described ad-hoc
@@ -63,9 +67,26 @@ func markCollectorUp(ch chan<- prometheus.Metric, name string, up int) {
     )
 }

+func checkIpmiProcRun(cmd string) bool {
+    //cmd := "pgrep ipmi-sensors"
+    ret, err := exec.Command("/bin/sh", "-c", cmd).Output()
+    log.Debugf("check cmd=%s, result=%s", cmd, string(ret))
+    if err != nil {
+        return false
+    }
+    arr := strings.Split(string(ret), "\n")
+    log.Debugf("string array len=%d, array=%v", len(arr), arr)
+    if len(arr) > 2 {
+        return true
+    }
+    return false
+}
+
 // Collect implements Prometheus.Collector.
 func (c metaCollector) Collect(ch chan<- prometheus.Metric) {
     start := time.Now()
+
+    log.Debugf("FGUO begin collector")
     defer func() {
         duration := time.Since(start).Seconds()
         log.Debugf("Scrape of target %s took %f seconds.", targetName(c.target), duration)
@@ -84,14 +105,30 @@ func (c metaCollector) Collect(ch chan<- prometheus.Metric) {
     for _, collector := range config.GetCollectors() {
         var up int
+        var result freeipmi.Result
         log.Debugf("Running collector: %s", collector.Name())
+        log.Debugf("collector.name: %s", collector.Cmd())
         fqcmd := path.Join(*executablesPath, collector.Cmd())
         args := collector.Args()
         cfg := config.GetFreeipmiConfig()
-        result := freeipmi.Execute(fqcmd, args, cfg, target.host, log.Base())
+        ipmi_res := checkIpmiProcRun("pgrep ipmi-sensors")
+        log.Debugf("checkIpmiProcRun=%v", ipmi_res)
+        if collector.Name() == "ipmi" && ipmi_res {
+            //we cached ipmimonitor command as it execute very slow on some hardware
+            result = ipmi_cache
+            log.Debugf("ipmimonitor use cache\n")
+        } else {
+            result = freeipmi.Execute(fqcmd, args, cfg, target.host, log.Base())
+            /*
+                if result.err != nil {
+                    log.Debugf("freeipmi.Exec result=%s\n", string(result.output))
+                }
+            */
+            if collector.Name() == "ipmi" {
+                ipmi_cache = result
+            }
+        }
         up, _ = collector.Collect(result, ch, target)
         markCollectorUp(ch, string(collector.Name()), up)
     }

[root@control01 ~]# uname -a
Linux control01 5.10.44-1 #1 SMP Thu Jun 17 05:47:40 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
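The draft patch above reads and writes the package-level `ipmi_cache` from concurrent scrapes without a lock, and it starts an extra `pgrep` shell for every collector run. As a rough sketch of the same caching idea done in-process (not ipmi_exporter code; `outputCache` and `Get` are hypothetical names, and the `slow` function only stands in for running ipmi-sensors), a mutex-guarded helper can make sure at most one slow command runs at a time while overlapping callers reuse the last output:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// outputCache is a hypothetical helper, not part of ipmi_exporter.
// It allows at most one slow command in flight and lets overlapping
// callers share the most recent output.
type outputCache struct {
	mu      sync.Mutex
	running bool
	done    chan struct{} // closed when the in-flight run finishes
	last    string
	hasLast bool
}

// Get runs the command when no run is in flight. While a run is in
// flight it returns the cached output, or waits for that run if
// nothing has been cached yet.
func (c *outputCache) Get(run func() string) string {
	c.mu.Lock()
	if c.running {
		if c.hasLast {
			out := c.last
			c.mu.Unlock()
			return out
		}
		done := c.done
		c.mu.Unlock()
		<-done // first run ever is still in progress: wait for it
		c.mu.Lock()
		out := c.last
		c.mu.Unlock()
		return out
	}
	c.running = true
	c.done = make(chan struct{})
	c.mu.Unlock()

	out := run() // the slow call happens outside the lock

	c.mu.Lock()
	c.last, c.hasLast, c.running = out, true, false
	close(c.done)
	c.mu.Unlock()
	return out
}

func main() {
	var cache outputCache
	var mu sync.Mutex
	runs := 0
	slow := func() string { // stand-in for executing ipmi-sensors
		mu.Lock()
		runs++
		mu.Unlock()
		time.Sleep(200 * time.Millisecond)
		return "sensor data"
	}

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ { // ten overlapping scrapes
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Get(slow)
		}()
	}
	wg.Wait()
	fmt.Println("command runs:", runs)
}
```

If the exporter serves more than one target, one such cache per target and per collector would probably be needed, since a single shared value would mix results between hosts. Returning cached output also means a scrape can receive readings that are older than the scrape interval.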

1.2) Example Prometheus configuration:

[root@hc-1 ~]# cat /etc/prometheus/prometheus.yml
global:
  scrape_interval: 5s
  evaluation_interval: 5s

rule_files:
  - "vm.rules"
  - "host.rules"
  - "cluster.rules"

scrape_configs:
  - job_name: host
    scrape_interval: 5s
    scrape_timeout: 5s
    file_sd_configs:
      - files:
          - host_target.yml
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__address__']
        separator: ':'
        regex: '(.*):.*'
        target_label: 'instance'
        replacement: '${1}'
[root@hc-1 ~]# cat /etc/prometheus/host_target.yml
- targets:
    - 192.168.182.155:9290
    - 192.168.182.155:9000
    - 192.168.182.155:9100
    - 192.168.182.155:9475
    - 192.168.182.156:9290
    - 192.168.182.156:9000
    - 192.168.182.156:9100
    - 192.168.182.156:9475
    - 192.168.182.157:9290
    - 192.168.182.157:9000
    - 192.168.182.157:9100
    - 192.168.182.157:9475
  labels:
    group: '2'

[root@control01 ~]# date;ipmimonitoring -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask ;date
Thu Dec 30 15:29:55 CST 2021
1,Inlet Temp,Temperature,Nominal,27.00,C,00C0h
2,Outlet Temp,Temperature,Nominal,35.00,C,00C0h
3,PCH Temp,Temperature,Nominal,53.00,C,00C0h
4,CPU1 Core Rem,Temperature,Nominal,40.00,C,00C0h
5,CPU2 Core Rem,Temperature,Nominal,40.00,C,00C0h
6,CPU1 DTS,Temperature,Nominal,-60.00,unspecified,00C0h
7,CPU2 DTS,Temperature,Nominal,-60.00,unspecified,00C0h
8,Cpu1 Margin,Temperature,Nominal,-49.00,unspecified,00C0h
9,Cpu2 Margin,Temperature,Nominal,-50.00,unspecified,00C0h
10,CPU1 MEM Temp,Temperature,Nominal,35.00,C,00C0h
11,CPU2 MEM Temp,Temperature,Nominal,32.00,C,00C0h
12,SYS 3.3V,Voltage,Nominal,3.30,V,00C0h
13,SYS 5V,Voltage,Nominal,5.13,V,00C0h
14,SYS 12V_1,Voltage,Nominal,12.18,V,00C0h
15,SYS 12V_2,Voltage,Nominal,12.12,V,00C0h
16,CPU1 DDR VPP1,Voltage,Nominal,2.54,V,00C0h
17,CPU1 DDR VPP2,Voltage,Nominal,2.54,V,00C0h
18,CPU2 DDR VPP1,Voltage,Nominal,2.54,V,00C0h
19,CPU2 DDR VPP2,Voltage,Nominal,2.52,V,00C0h
20,FAN1 Speed,Fan,Nominal,5640.00,RPM,00C0h
21,FAN2 Speed,Fan,Nominal,5640.00,RPM,00C0h
22,FAN3 Speed,Fan,Nominal,5640.00,RPM,00C0h
23,FAN4 Speed,Fan,Nominal,5760.00,RPM,00C0h
24,Power,Other Units Based Sensor,Nominal,252.00,W,00C0h
25,Disks Temp,Temperature,Nominal,31.00,C,00C0h
26,RAID Temp,Temperature,Nominal,48.00,C,00C0h
27,Raid BBU Temp,Temperature,Nominal,28.00,C,00C0h
28,Power1,Other Units Based Sensor,Nominal,120.00,W,00C0h
29,PS1 VIN,Voltage,Nominal,218.00,V,00C0h
30,PS1 Inlet Temp,Temperature,Nominal,27.00,C,00C0h
31,Power2,Other Units Based Sensor,Nominal,132.00,W,00C0h
32,PS2 VIN,Voltage,Nominal,218.00,V,00C0h
33,PS2 Inlet Temp,Temperature,Nominal,23.00,C,00C0h
34,CPU1 VCore,Voltage,Nominal,1.78,V,00C0h
35,CPU2 VCore,Voltage,Nominal,1.78,V,00C0h
36,CPU1 DDR VDDQ,Voltage,Nominal,1.22,V,00C0h
37,CPU1 DDR VDDQ2,Voltage,Nominal,1.22,V,00C0h
38,CPU2 DDR VDDQ,Voltage,Nominal,1.22,V,00C0h
39,CPU2 DDR VDDQ2,Voltage,Nominal,1.23,V,00C0h
40,CPU1 VDDQ Temp,Temperature,Nominal,34.00,C,00C0h
41,CPU2 VDDQ Temp,Temperature,Nominal,33.00,C,00C0h
42,CPU1 VRD Temp,Temperature,Nominal,41.00,C,00C0h
43,CPU2 VRD Temp,Temperature,Nominal,40.00,C,00C0h
44,CPU1 VSA,Voltage,Nominal,0.87,V,00C0h
45,CPU2 VSA,Voltage,Nominal,0.85,V,00C0h
46,CPU1 VCCIO,Voltage,Nominal,0.99,V,00C0h
47,CPU2 VCCIO,Voltage,Nominal,0.99,V,00C0h
48,PCH VPVNN,Voltage,Nominal,0.99,V,00C0h
49,PCH PRIM 1V05,Voltage,Nominal,1.04,V,00C0h
50,P4GPU4 Temp,Temperature,Nominal,47.00,C,00C0h
51,LOM P1 Link Down,Slot/Connector,Nominal,N/A,N/A,0000h
52,LOM P2 Link Down,Slot/Connector,Nominal,N/A,N/A,0000h
53,LOM P3 Link Down,Slot/Connector,Nominal,N/A,N/A,0000h
54,LOM P4 Link Down,Slot/Connector,Nominal,N/A,N/A,0000h
55,PCH Status,Chip Set,N/A,N/A,N/A,0000h
56,CPU1 UPI Link,Cable/Interconnect,Nominal,N/A,N/A,0000h
57,CPU1 Prochot,Processor,Nominal,N/A,N/A,0000h
58,CPU2 UPI Link,Cable/Interconnect,Nominal,N/A,N/A,0000h
59,CPU2 Prochot,Processor,Nominal,N/A,N/A,0000h
60,System Notice,System Event,N/A,N/A,N/A,0000h
61,System Error,System Event,Nominal,N/A,N/A,0000h
62,CPU1 Status,Processor,Nominal,N/A,N/A,0080h
63,CPU2 Status,Processor,Nominal,N/A,N/A,0080h
64,CPU1 Memory,Memory,Nominal,N/A,N/A,0000h
65,CPU2 Memory,Memory,Nominal,N/A,N/A,0000h
66,FAN1 Status,Slot/Connector,Nominal,N/A,N/A,0000h
67,FAN2 Status,Slot/Connector,Nominal,N/A,N/A,0000h
68,FAN3 Status,Slot/Connector,Nominal,N/A,N/A,0000h
69,FAN4 Status,Slot/Connector,Nominal,N/A,N/A,0000h
70,DIMM000,Memory,Nominal,N/A,N/A,0040h
71,DIMM001,Memory,Nominal,N/A,N/A,0000h
72,DIMM010,Memory,Nominal,N/A,N/A,0040h
73,DIMM011,Memory,Nominal,N/A,N/A,0000h
74,DIMM020,Memory,Nominal,N/A,N/A,0000h
75,DIMM021,Memory,Nominal,N/A,N/A,0000h
76,DIMM030,Memory,Nominal,N/A,N/A,0000h
77,DIMM031,Memory,Nominal,N/A,N/A,0000h
78,DIMM040,Memory,Nominal,N/A,N/A,0000h
79,DIMM041,Memory,Nominal,N/A,N/A,0000h
80,DIMM050,Memory,Nominal,N/A,N/A,0000h
81,DIMM051,Memory,Nominal,N/A,N/A,0000h
82,DIMM100,Memory,Nominal,N/A,N/A,0040h
83,DIMM101,Memory,Nominal,N/A,N/A,0000h
84,DIMM110,Memory,Nominal,N/A,N/A,0040h
85,DIMM111,Memory,Nominal,N/A,N/A,0000h
86,DIMM120,Memory,Nominal,N/A,N/A,0000h
87,DIMM121,Memory,Nominal,N/A,N/A,0000h
88,DIMM130,Memory,Nominal,N/A,N/A,0000h
89,DIMM131,Memory,Nominal,N/A,N/A,0000h
90,DIMM140,Memory,Nominal,N/A,N/A,0000h
91,DIMM141,Memory,Nominal,N/A,N/A,0000h
92,DIMM150,Memory,Nominal,N/A,N/A,0000h
93,DIMM151,Memory,Nominal,N/A,N/A,0000h
94,RTC Battery,Battery,Nominal,N/A,N/A,0000h
95,PCIE Status,Slot/Connector,Nominal,N/A,N/A,0000h
96,ACPI State,System ACPI Power State,Nominal,N/A,N/A,0001h
97,SysFWProgress,System Firmware Progress,Nominal,N/A,N/A,0000h
98,Power Button,Button/Switch,Nominal,N/A,N/A,0000h
99,SysRestart,System Boot Initiated,N/A,N/A,N/A,0080h
100,Boot Error,Boot Error,Nominal,N/A,N/A,0000h
101,Watchdog2,Watchdog 2,Nominal,N/A,N/A,0000h
102,Mngmnt Health,Management Subsystem Health,Nominal,N/A,N/A,0000h
103,UID Button,Button/Switch,Nominal,N/A,N/A,0000h
104,PwrOk Sig. Drop,Power Supply,Nominal,N/A,N/A,0000h
105,PwrOn TimeOut,Power Supply,Nominal,N/A,N/A,0000h
106,PwrCap Status,System Event,N/A,N/A,N/A,0000h
107,HDD Backplane,Cable/Interconnect,Nominal,N/A,N/A,0000h
108,HDD BP Status,Module/Board,N/A,N/A,N/A,0000h
109,Riser1 Card,Add In Card,N/A,N/A,N/A,0000h
110,Riser2 Card,Add In Card,N/A,N/A,N/A,0000h
111,Riser3 Card,Add In Card,N/A,N/A,N/A,0000h
112,SAS Cable,Cable/Interconnect,Nominal,N/A,N/A,0000h
113,FAN1 Presence,Cooling Device,N/A,N/A,N/A,0000h
114,FAN2 Presence,Cooling Device,N/A,N/A,N/A,0000h
115,FAN3 Presence,Cooling Device,N/A,N/A,N/A,0000h
116,FAN4 Presence,Cooling Device,N/A,N/A,N/A,0000h
117,RAID Presence,Add In Card,N/A,N/A,N/A,0002h
118,CPU Usage,Processor,N/A,N/A,N/A,0000h
119,Memory Usage,Memory,N/A,N/A,N/A,0000h
120,LCD Status,Terminator,N/A,N/A,N/A,0000h
121,LCD Presence,Terminator,N/A,N/A,N/A,0001h
122,PS Redundancy,Power Supply,Nominal,N/A,N/A,0000h
123,BMC Boot Up,Microcontroller/Coprocessor,N/A,N/A,N/A,0002h
124,BMC Time Hopping,Microcontroller/Coprocessor,N/A,N/A,N/A,0000h
125,NTP Sync Failed,Microcontroller/Coprocessor,N/A,N/A,N/A,0000h
126,SEL Status,Event Logging Disabled,Nominal,N/A,N/A,0000h
127,Op. Log Full,Event Logging Disabled,N/A,N/A,N/A,0000h
128,Sec. Log Full,Event Logging Disabled,N/A,N/A,N/A,0000h
129,Host Loss,Slot/Connector,Nominal,N/A,N/A,0000h
130,NIC1 Presence,Add In Card,N/A,N/A,N/A,0002h
131,RAID Status,Add In Card,N/A,N/A,N/A,0000h
132,RAID PCIE ERR,Add In Card,N/A,N/A,N/A,0000h
133,RAID Card BBU,Battery,Nominal,N/A,N/A,0004h
134,DISK0,Drive Slot,Nominal,N/A,N/A,0001h
135,DISK1,Drive Slot,Nominal,N/A,N/A,0001h
136,DISK2,Drive Slot,Nominal,N/A,N/A,0000h
137,DISK3,Drive Slot,Nominal,N/A,N/A,0000h
138,DISK4,Drive Slot,Nominal,N/A,N/A,0000h
139,DISK5,Drive Slot,Nominal,N/A,N/A,0000h
140,DISK6,Drive Slot,Nominal,N/A,N/A,0000h
141,DISK7,Drive Slot,Nominal,N/A,N/A,0000h
142,NIC1 Status,Add In Card,N/A,N/A,N/A,0000h
143,Port1 Link Down,Slot/Connector,Warning,N/A,N/A,0100h
144,Port2 Link Down,Slot/Connector,Warning,N/A,N/A,0100h
145,PS1 Status,Power Supply,Nominal,N/A,N/A,0001h
146,PS1 Fan Status,Slot/Connector,Nominal,N/A,N/A,0000h
147,PS1 Temp Status,Slot/Connector,Nominal,N/A,N/A,0000h
148,PS2 Status,Power Supply,Nominal,N/A,N/A,0001h
149,PS2 Fan Status,Slot/Connector,Nominal,N/A,N/A,0000h
150,PS2 Temp Status,Slot/Connector,Nominal,N/A,N/A,0000h
Thu Dec 30 15:31:12 CST 2021
[root@control01 ~]#

[root@control01 ~]# ps auxw|grep ipmi
root 121032 0.3 0.0 718368 25508 ? Sl 14:57 0:06 /usr/local/bin/ipmi_exporter --web.listen-address=:9290 --config.file=/etc/ipmi_local.yml
root 163339 0.0 0.0 41320 7656 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-45448a822dfaa7be784fe2dd0fa6a103 --ignore-not-available-sensors
root 163370 0.0 0.0 41316 7468 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-47d530d6fe172ec843261ddd2e44f990 --ignore-not-available-sensors
root 163401 0.0 0.0 41320 7396 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-b4a58f4a8c9cc00f043f566e1029dc28 --ignore-not-available-sensors
root 163504 0.0 0.0 41324 7644 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-2d196acf67dfa167e371b637d5fd071c --ignore-not-available-sensors
root 163533 0.0 0.0 41320 7532 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-dc73fea7dc1cd7ae41efadb3641c6676 --ignore-not-available-sensors
root 163568 0.0 0.0 41320 7420 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-44afb5093d39703e142e5dc29f21cd0a --ignore-not-available-sensors
root 163597 0.0 0.0 41320 7556 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-bbc102922fbd093450d832c8f02a9884 --ignore-not-available-sensors
root 163662 0.0 0.0 41324 7468 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-5180f15f6379f0597cd05e2621fb8e00 --ignore-not-available-sensors
root 163691 0.0 0.0 41316 7368 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-bf173761b35d2919b0531132792ad715 --ignore-not-available-sensors
root 163723 0.0 0.0 41320 7472 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-7385320570b15caed3a079dbd39a4a64 --ignore-not-available-sensors
root 163752 0.0 0.0 41316 7552 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-dec6f299d54aef55541e86d4644793d8 --ignore-not-available-sensors
root 163815 0.0 0.0 41320 7496 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-e880c2c612430010c015436f64f3f397 --ignore-not-available-sensors
root 163843 0.0 0.0 41320 7620 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-37d01724636975d71b42c2dab470de19 --ignore-not-available-sensors
root 163874 0.0 0.0 41324 7644 ? S 15:31 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-f78289da13c57c8bcb33a1ae82cd217c --ignore-not-available-sensors
root 163909 0.0 0.0 41320 7380 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-43906b243ffe1b3dbe9b29d5b7174e76 --ignore-not-available-sensors
root 163981 0.0 0.0 41320 6472 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-5cda14055650c0eb73c8fa11fdddfe20 --ignore-not-available-sensors
root 164010 0.0 0.0 41320 7464 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-8634458eecefe164c0ee5823d2f05201 --ignore-not-available-sensors
root 164052 0.0 0.0 41320 7396 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-339701a79d61822aec2b049629745d52 --ignore-not-available-sensors
root 164080 0.0 0.0 41320 7556 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-929d18ceda31b794800d0b525316a809 --ignore-not-available-sensors
root 164145 0.0 0.0 41316 7308 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-a5d07f5b53de2ffb9d5fc34b2367be73 --ignore-not-available-sensors
root 164174 0.0 0.0 41316 6640 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-ee736d6146539bfbc8d6adfdb6a0044d --ignore-not-available-sensors
root 164208 0.0 0.0 41320 7444 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-72c6c6814b2af351dc80da0bdacb76e4 --ignore-not-available-sensors
root 164238 0.0 0.0 41316 7460 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-4711cfbc13b4e2f35b19b77a69315d22 --ignore-not-available-sensors
root 164395 0.0 0.0 41320 7576 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-70dbb9c675cce84adee14f8400218460 --ignore-not-available-sensors
root 164425 0.0 0.0 41316 7572 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-acf45e5746c9b28ac5899381b7579085 --ignore-not-available-sensors
root 164457 0.0 0.0 41316 7592 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-d4ed68c84adf1985523fc21f8ee20030 --ignore-not-available-sensors
root 164488 0.0 0.0 41324 7420 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-a6b0d45098d0dad42432b16fc46dff23 --ignore-not-available-sensors
root 164534 0.0 0.0 41320 7388 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-1298ce54d4139dfb77522ccc55d298a9 --ignore-not-available-sensors
root 164617 0.0 0.0 41316 7384 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-772ac90db1a4064006b28a5b23ac7e3d --ignore-not-available-sensors
root 164652 0.1 0.0 41320 7396 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-b906ff690f19103eb9dc062b39c50859 --ignore-not-available-sensors
root 164678 0.1 0.0 41316 7400 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-d66596e53b10859e9589755796666ed2 --ignore-not-available-sensors
root 164720 0.1 0.0 41324 7420 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-97a3abd29bd20c86d5984e818a8cb919 --ignore-not-available-sensors
root 164772 0.0 0.0 41316 7592 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-6191424fa164b793d0365025c687680d --ignore-not-available-sensors
root 164805 0.1 0.0 41320 7496 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-dd5ea16aeb1aa2640320a17b56da2c32 --ignore-not-available-sensors
root 164834 0.1 0.0 41320 7432 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-0abd4dd687a33c3c4baf6c288802b82f --ignore-not-available-sensors
root 164876 0.1 0.0 41320 7464 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-b420215876af7fd716d41ff278bb562c --ignore-not-available-sensors
root 164928 0.1 0.0 41320 7420 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-790e913dd22cb70a14f1175de70ee502 --ignore-not-available-sensors
root 164960 0.3 0.0 41316 7652 ? S 15:32 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-2da010cb9a220161b10df5f476dedd30 --ignore-not-available-sensors
root 164991 0.2 0.0 41316 7384 ? S 15:33 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-60e4755e190cafd2dceadf55a0fc53a5 --ignore-not-available-sensors
root 165034 0.4 0.0 41316 7592 ? S 15:33 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-4b6cbfda5fe121e4d86a6451f9850cd6 --ignore-not-available-sensors
root 165084 0.6 0.0 41316 7384 ? S 15:33 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-f7765703c83e509b19a7dee2f6c1266d --ignore-not-available-sensors
root 165118 0.0 0.0 41204 6460 ? S 15:33 0:00 /usr/sbin/ipmi-sensors --output-sensor-state -Q --ignore-unrecognized-events --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-bitmask --config-file /tmp/ipmi_exporter-8a42354bae559ce7214faa42fb1c6e7f --ignore-not-available-sensors
root 165124 0.0 0.0 221908 1000 pts/10 S+ 15:33 0:00 grep --color=auto ipmi
[root@control01 ~]# ps auxw|grep ipmi|wc -l
44

[root@control01 ~]# netstat -antp|grep ipmi
tcp6 0 0 :::9290 :::* LISTEN 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45896 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57402 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:46010 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57438 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:46092 ESTABLISHED 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45994 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57272 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:46058 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57256 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57454 ESTABLISHED 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57288 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45944 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45960 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45876 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57320 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45912 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45978 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57220 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57254 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57368 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45928 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57336 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45894 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57386 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57238 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57204 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:45844 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:46042 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57422 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:46074 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.181:46026 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57352 CLOSE_WAIT 121032/ipmi_exporte
tcp6 0 0 192.168.182.155:9290 192.168.185.180:57304 CLOSE_WAIT 121032/ipmi_exporte

chris-fengtian-guo avatar Dec 30 '21 08:12 chris-fengtian-guo

The hardware info

[root@control01 ~]# dmidecode |more
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.
Table at 0x6F8EC000.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
    Vendor: INSYDE Corp.
    Version: 0.99
    Release Date: 11/14/2018
    Address: 0xE0000
    Runtime Size: 128 kB
    ROM Size: 16 MB
    Characteristics:
        PCI is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        Boot from CD is supported
        Selectable boot is supported
        EDD is supported
        Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
        Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
        5.25"/360 kB floppy services are supported (int 13h)
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 kB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        8042 keyboard services are supported (int 9h)
        CGA/mono video services are supported (int 10h)
        ACPI is supported
        USB legacy is supported
        BIOS boot specification is supported
        Targeted content distribution is supported
        UEFI is supported
    BIOS Revision: 0.220

Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: Huawei
    Product Name: 2288H V5
    Version: Purley
    Serial Number: 2102311TXH6TJC001649
    UUID: 85e2d65e-7cc3-bf6b-e911-970694603fb6
    Wake-up Type: Power Switch
    SKU Number: Type1Sku0
    Family: Type1Family

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
    Manufacturer: Huawei
    Product Name: BC11SPSCB0
    Version: V100R005
    Serial Number: 024AFQ6TJC006409
    Asset Tag: To be filled by O.E.M.
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis: Type2 - Board Chassis Location
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Handle 0x0003, DMI type 3, 25 bytes
Chassis Information
    Manufacturer: Huawei
    Type: Main Server Chassis
    Lock: Not Present
    Version: To be filled by O.E.M.
    Serial Number: To be filled by O.E.M.
    Asset Tag: To be filled by O.E.M.
    Boot-up State: Safe
    Power Supply State: Safe
    Thermal State: Safe
    Security Status: None
    OEM Information: 0x00000000
    Height: 2 U
    Number Of Power Cords: 1
    Contained Elements: 0
    SKU Number: Not Specified

Handle 0x0004, DMI type 4, 48 bytes
Processor Information
    Socket Designation: CPU01
    Type: Central Processor
    Family: Xeon
    Manufacturer: Intel(R) Corporation
    ID: 54 06 05 00 FF FB EB BF
    Signature: Type 0, Family 6, Model 85, Stepping 4
    Flags:
        FPU (Floating-point unit on-chip)
        VME (Virtual mode extension)
        DE (Debugging extension)
        PSE (Page size extension)
        TSC (Time stamp counter)
        MSR (Model specific registers)
        PAE (Physical address extension)
        MCE (Machine check exception)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
        SEP (Fast system call)
        MTRR (Memory type range registers)
        PGE (Page global enable)
        MCA (Machine check architecture)
        CMOV (Conditional move instruction supported)
        PAT (Page attribute table)
        PSE-36 (36-bit page size extension)
        CLFSH (CLFLUSH instruction supported)
        DS (Debug store)
        ACPI (ACPI supported)
        MMX (MMX technology supported)
        FXSR (FXSAVE and FXSTOR instructions supported)
        SSE (Streaming SIMD extensions)
        SSE2 (Streaming SIMD extensions 2)
        SS (Self-snoop)
        HTT (Multi-threading)
        TM (Thermal monitor supported)
        PBE (Pending break enabled)
    Version: Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz

chris-fengtian-guo avatar Dec 30 '21 08:12 chris-fengtian-guo

Hello there. Sorry for the late reply. I think using a 5 second scrape interval is very ambitious for IPMI. Note that all collectors are run in serial. How many collectors do you run?

Also, can you post your patch again? The formatting unfortunately makes it impossible to read.

bitfehler avatar Feb 21 '22 20:02 bitfehler

Hello.

Thanks for your reply and suggestion.

I hope the following information helps.

  1. I have attached my draft patch.

  2. Thanks for your suggestion; we have increased the IPMI scrape interval in Prometheus. The following is my IPMI config for the local collector.

  3. I am new to both Go and IPMI, so the draft patch only works around the issue and is not the best fix. You will probably have a much better solution.

  4. The draft patch is not based on the top of the master branch. git log

From: Conrad Hoffmann
Date: 2022-02-22 04:05
To: prometheus-community/ipmi_exporter
CC: chris-fengtian-guo; Author
Subject: Re: [prometheus-community/ipmi_exporter] more and more process of ipmi-sensors when multiple prometheus scrape from the same ipmi-exporter (Issue #95)

chris-fengtian-guo avatar Feb 22 '22 01:02 chris-fengtian-guo