fortigate_exporter Timeout when querying secondary unit in HA mode

I have observed this while troubleshooting fortigate_exporter timeouts, that happens sporadically across our fortigate fleet. normally scraping of one device would take 2-3 seconds. however in case of secondary unit in HA cluster, same scrape takes close to 25s. sporadically spilling over 30s default timeout.

It looks like the root cause it an API call to /api/v2/monitor/system/fortimanager/status?vdom=*. on a secondary that takes 10-20sec, and occasionally even longer.

I'm not sure if this is something you want to deal with, or it's more of a fortigate issue, however I'm creating issue here to document this behaviour.

Mar 24 '23 03:03 p-v-a

+1, came here to say the same thing. Scraping devices that are not primary is causing issues with false reports of devices being down.

2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/fortimanager/status?vdom=*": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/ha-statistics": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/interface/select?vdom=*&include_vlan=true&include_aggregate=true": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/link-monitor?vdom=*": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/resource/usage?interval=1-min&scope=global": context canceled
2023/07/04 15:48:34 Warning: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/sensor-info?vdom=root": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/status": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/resource/usage?interval=1-min&vdom=*": context canceled
2023/07/04 15:48:34 Error: Get "https://fortigatehostname.my.tld/api/v2/monitor/system/ha-checksums?scope=global": context canceled
2023/07/04 15:48:34 Probe of "https://fortigatehostname.my.tld" failed, took 30.000 seconds

Jul 04 '23 15:07 aaronnad

Yes, I ended up separating scrape jobs. one that scrape each fortigate host and includes only metrics that makes sense for individual box, like that:

      probes:
        include:
          - System/SensorInfo
          - System/Status
          - System/Time/Clock
          - System/Resource/Usage
          - License/Status
          - WebUI/State

and then second one that scrapes cluster VIP and excludes metrics above:

      probes:
        exclude:
          - System/SensorInfo
          - System/Status
          - System/Time/Clock
          - System/Resource/Usage
          - License/Status
          - WebUI/State

Jul 11 '23 00:07 p-v-a

fortigate_exporter fortigate_exporter copied to clipboard

Timeout when querying secondary unit in HA mode

fortigate_exporter
fortigate_exporter copied to clipboard