
Commands for ipmi and bmc die with timeout

Open mekanix opened this issue 7 months ago • 16 comments

I have the following in my ipmi-exporter.yml

modules:
  default:
    collectors:
    - bmc
    - ipmi
    - chassis
    - sel
    - sel-events
    driver: "LAN_2_0"
    pass: "mypass"
    privilege: "user"
    user: "myuser"

This is part of my ipmi-targets.yml

- labels:
    job: ipmi_exporter
  targets:
  - host1
  - host2
  . . .

My prometheus.yml portion for this exporter is this

- file_sd_configs:
  - files:
    - /etc/prometheus/ipmi-targets.yml
    refresh_interval: 10m
  job_name: ipmi-exporter
  metrics_path: /ipmi
  params:
    module:
    - default

When I do ps ax | grep ipmi-sel I see the following

ipmi-sel --quiet-cache --comma-separated-output --no-header-output --sdr-cache-recreate --output-event-state --interpret-oem-data --entity-sensor-names --config-file /tmp/ipmi_exporter-cc2225c5535b995b1eb9a53327724295 -h host1

I thought that the config file would contain the user/pass, but that file is always zero bytes long. What am I doing wrong?

mekanix avatar Jun 18 '25 08:06 mekanix

This is working as intended. The config is not a regular file, it is a named pipe. Hence, it can only be read once, and it will be read by the exporter itself (edit: actually, by ipmi-sel, executed by the exporter) immediately after creation.
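
To illustrate the named-pipe behavior, here is a minimal Python sketch. It is not the exporter's code; the function name `fifo_demo` and the file name `demo-config` are made up for illustration, and it assumes a Unix-like system where `os.mkfifo` is available. It creates a FIFO, records what `stat` reports for it, and then reads the content exactly once while a writer thread supplies it.

```python
import os
import stat
import tempfile
import threading


def fifo_demo(payload: str) -> dict:
    """Create a FIFO, write `payload` from one thread, read it once from another."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo-config")  # hypothetical file name
    os.mkfifo(path, 0o600)

    # A FIFO is not a regular file: stat reports it as a pipe holding no data.
    info = os.stat(path)
    result = {"is_fifo": stat.S_ISFIFO(info.st_mode), "size": info.st_size}

    def writer() -> None:
        # Opening for writing blocks until a reader opens the other end.
        with open(path, "w") as f:
            f.write(payload)

    t = threading.Thread(target=writer)
    t.start()
    with open(path) as f:
        # The reader consumes the data; nothing is stored on disk.
        result["read"] = f.read()
    t.join()

    os.unlink(path)
    os.rmdir(workdir)
    return result


if __name__ == "__main__":
    print(fifo_demo("username myuser\n"))
```

Once the single reader has consumed the data, nothing remains in the pipe, which would explain why looking at the path from another shell shows an empty file.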

Please describe the actual problems you are observing, e.g. any errors you get in the logs or such.

bitfehler avatar Jun 20 '25 10:06 bitfehler

The ipmi-sel process just hangs and then dies with a timeout. If I take the same command I posted above and add -u <user> -P, the command works. I didn't realize the config file is a pipe, so it makes sense that it's zero length. Thank you for explaining that.

mekanix avatar Jun 20 '25 10:06 mekanix

Is there an easy(ish) way to get the data that was written to the config file (the pipe)? I'd like to manually run ipmi-sel with the same args and config that the exporter uses and see what I get. Ideally, I'd then send a PR or change the config once I figure out what's wrong.

mekanix avatar Jun 20 '25 13:06 mekanix

Strange. And is the SEL collector the only one that is not working?

bitfehler avatar Jun 20 '25 13:06 bitfehler

Also, can you post the error message from ipmi-sel that should show up in the exporter log?

bitfehler avatar Jun 20 '25 13:06 bitfehler

source=collector.go:120 msg="Collector failed" name=sel-events error="error running ipmi-sel: exit status 1: ipmi-sel: connection timeout\n"
source=collector.go:120 msg="Collector failed" name=bmc error="error running bmc-info: exit status 1: bmc-info: connection timeout\n"
source=collector.go:120 msg="Collector failed" name=ipmi error="error running ipmimonitoring: exit status 1: /usr/sbin/ipmi-sensors: connection timeout\n"

mekanix avatar Jun 20 '25 13:06 mekanix

Ok, this already looks quite different then. Looking at your config in the initial post, this looks a bit odd, actually. Are you running an exporter on every host you're trying to scrape? Or do you have one central exporter instance for all hosts?

bitfehler avatar Jun 20 '25 13:06 bitfehler

I have a central exporter instance.

mekanix avatar Jun 20 '25 13:06 mekanix

If we assume I have a centralized ipmi-exporter and prometheus instance on the same server, what should the config look like? The reason for the centralized node is that we have JBODs with IPMI, so we can't install the exporter on them. Thank you!

mekanix avatar Jun 20 '25 18:06 mekanix

Does your prometheus config not have a relabeling rule for setting __address__ and __param_target? Or did you just omit those?

bitfehler avatar Jun 23 '25 08:06 bitfehler

Sorry, just my bad copy/paste. This is the whole section

- file_sd_configs:
  - files:
    - /etc/prometheus/ipmi-targets.yml
    refresh_interval: 10m
  job_name: ipmi-exporter
  metrics_path: /ipmi
  params:
    module:
    - default
  relabel_configs:
  - action: replace
    regex: (.*)
    replacement: ${1}
    separator: ;
    source_labels:
    - __address__
    target_label: __param_target
  - action: replace
    regex: (.*)
    replacement: ${1}
    separator: ;
    source_labels:
    - __param_target
    target_label: instance
  - action: replace
    regex: .*
    replacement: localhost:9290
    separator: ;
    target_label: __address__
  scheme: http
  scrape_interval: 5m
  scrape_timeout: 5m

mekanix avatar Jun 23 '25 08:06 mekanix

Ok, that looks about right then. And is the exporter running in Docker or native?

bitfehler avatar Jun 23 '25 08:06 bitfehler

Native

mekanix avatar Jun 23 '25 08:06 mekanix

Ok, then this is a bit strange, indeed. I just noticed one thing in the config you posted, but not really sure that this is the issue:

My prometheus.yml portion for this exporter is this

- file_sd_configs:
  - files:
    - /etc/prometheus/ipmi-targets.yml
    refresh_interval: 10m
  job_name: ipmi-exporter
  metrics_path: /ipmi
  params:
    module:
    - default

The last part should probably be:

  params:
    - module: "default"

But again, not sure this really makes a difference.

Now, given your initially posted config snippet, the config file that gets generated for FreeIPMI would look like this:

driver-type LAN_2_0
privilege-level user
username myuser
password mypass

Can you create this file with the appropriate user/password values and run the ipmi-sel command from your initial post, just replacing the value for --config-file to point to this file?

Please make sure to run this command on the host that the exporter runs on and also as the user that the exporter runs as.
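
If it helps, the manual test could be scripted roughly like this. This is only a sketch under assumptions: the helper names `write_freeipmi_config` and `ipmi_sel_command` and the `ipmi-debug-` file prefix are invented here, the four config keys are the ones listed above, and the flags are copied from the ipmi-sel process shown in the initial post. It writes a regular file (not a pipe) that only the owner can read and prints the command to run, without executing it.

```python
import os
import shlex
import tempfile


def write_freeipmi_config(driver, privilege, user, password, directory=None):
    """Write the four settings to a regular file and return its path."""
    lines = [
        f"driver-type {driver}",
        f"privilege-level {privilege}",
        f"username {user}",
        f"password {password}",
    ]
    # mkstemp creates the file with mode 0600, so the password stays private.
    fd, path = tempfile.mkstemp(prefix="ipmi-debug-", dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(lines) + "\n")
    return path


def ipmi_sel_command(config_path, host):
    """Build the argument list with the same flags seen in `ps ax`."""
    return [
        "ipmi-sel",
        "--quiet-cache",
        "--comma-separated-output",
        "--no-header-output",
        "--sdr-cache-recreate",
        "--output-event-state",
        "--interpret-oem-data",
        "--entity-sensor-names",
        "--config-file", config_path,
        "-h", host,
    ]


if __name__ == "__main__":
    path = write_freeipmi_config("LAN_2_0", "user", "myuser", "mypass")
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(ipmi_sel_command(path, "host1")))
```

Remember to delete the generated file afterwards, since it contains the password in plain text.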

bitfehler avatar Jun 23 '25 10:06 bitfehler

I went through the list and there are 3 types of nodes that I encountered:

  1. Not pingable at all (no wonder there's a timeout)
  2. Node is working fine, but there were network glitches (or something) so it ended up in the log
  3. Node doesn't accept LAN_2_0

Of course, the first two are to be ignored. Is there a way to specify the driver-type per node, or would I have to run two exporters: one for LAN_2_0 and one without it?

mekanix avatar Jul 01 '25 10:07 mekanix

You have two options. Recall the

params:
  - module: "default"

in the prometheus config.

  1. To keep your exporter config simple, split your targets into two batches in the prometheus config. For the first one, keep module: default. For the second one, set e.g. module: otherdriver. In your exporter config, create a second module otherdriver with the desired settings (see the example config).
  2. To keep your prometheus config simple, instead of setting the module as a fixed param, add a relabeling rule that sets module to the IP of the target host. This means that in your exporter config, you'll have to create a module for every target host. This may seem cumbersome, but many folks generate their exporter config anyways, due to unique passwords or other circumstances.

Neither of these options is "better" than the other; it really just depends on what makes more sense to you. Hope that helps.
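
For the second option, generating the exporter config could look roughly like the sketch below. Everything in it is illustrative: `render_modules` is a made-up helper, the host names and credentials are the placeholders from this thread, and "LAN" as the driver for the older nodes is an assumption (check which FreeIPMI driver type they actually accept). It emits one module per target, named after the target, so that a relabeling rule on the prometheus side (presumably one writing the target into `__param_module`) can select it.

```python
import json


def render_modules(hosts, collectors):
    """Render a `modules:` section with one module per target host.

    `hosts` maps a target name to its settings (driver, pass, privilege, user).
    """
    out = ["modules:"]
    for name in sorted(hosts):
        cfg = hosts[name]
        # json.dumps yields a double-quoted string, which is also valid YAML,
        # so passwords containing quotes or backslashes are escaped safely.
        out.append(f"  {json.dumps(name)}:")
        out.append("    collectors:")
        out.extend(f"    - {c}" for c in collectors)
        for key in ("driver", "pass", "privilege", "user"):
            out.append(f"    {key}: {json.dumps(cfg[key])}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    hosts = {
        "host1": {"driver": "LAN_2_0", "pass": "mypass", "privilege": "user", "user": "myuser"},
        # Assumption: this node only accepts the older LAN driver.
        "host2": {"driver": "LAN", "pass": "mypass", "privilege": "user", "user": "myuser"},
    }
    print(render_modules(hosts, ["bmc", "ipmi", "chassis", "sel", "sel-events"]), end="")
```

The output can be written to the exporter config file; per-host passwords or collector lists would fit into the same structure.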

bitfehler avatar Jul 11 '25 08:07 bitfehler