Rest fails to retrieve private/cli metrics

Open frankvdbh opened this issue 1 year ago • 3 comments

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please let us know in a comment

Problem

The node > healthy status seems to be reported only when Zapi is enabled. When only the Rest collector is enabled, or when Rest is listed first in the collectors, the regular node counters are collected, but the counters from the /api/private/cli/node endpoint below return no data, so there is no "healthy" field:

endpoints:
  - query: api/private/cli/node
    counters:
      - ^^node                                        => node
      - ^health                                       => healthy
      - ^max_aggr_size                                => max_aggr_size
      - ^max_node_vvols                               => max_vol_num
      - ^max_vol_size                                 => max_vol_size
      - ^vendor                                       => vendor
      - cpu_busy_time                                 => cpu_busytime

When we browse to the API manually, the results are returned as expected: https://<IP>/api/private/cli/node?fields=health,node,vendor
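To reproduce the manual check for every counter in the template rather than a hand-picked few, the field list can be derived from the counter lines themselves. The following is a minimal sketch, not Harvest code: `parse_counter` and `build_url` are hypothetical helpers, the host name is a placeholder, and the prefix meanings (`^^` for the instance key, `^` for a label, no prefix for a numeric metric) are an assumption about the template convention.

```python
# Counter lines copied from the template excerpt in this issue.
COUNTERS = [
    "^^node => node",
    "^health => healthy",
    "^max_aggr_size => max_aggr_size",
    "^max_node_vvols => max_vol_num",
    "^max_vol_size => max_vol_size",
    "^vendor => vendor",
    "cpu_busy_time => cpu_busytime",
]


def parse_counter(line):
    """Split 'ontap_name => display_name' and classify the counter by prefix."""
    source, _, display = (part.strip() for part in line.partition("=>"))
    if source.startswith("^^"):
        kind = "key"      # assumed: instance key
    elif source.startswith("^"):
        kind = "label"    # assumed: string label
    else:
        kind = "metric"   # assumed: numeric metric
    name = source.lstrip("^")
    return name, display or name, kind


def build_url(host, query, counters):
    """Build the URL that requests every ONTAP field named in the counters."""
    fields = ",".join(parse_counter(c)[0] for c in counters)
    return f"https://{host}/{query}?fields={fields}"


if __name__ == "__main__":
    # "cluster.example" is a placeholder for the cluster management address.
    print(build_url("cluster.example", "api/private/cli/node", COUNTERS))
```

Opening the printed URL in a browser, or with any HTTP client, shows whether ONTAP itself returns all seven fields, independently of the collector.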

We see this issue on both 9.8.0 and 9.13.1. We would like to use only Rest so that we do not have to extend both the Rest and Zapi configuration files with additional metrics.

Configuration

Admin:
Tools:
Exporters:
    influxdb:
        exporter: InfluxDB
        addr: -REDACTED-
        bucket: test-harvest
        org: -REDACTED-
        token: -REDACTED-
Defaults:
    use_insecure_tls: true
Pollers:
    unix:
        datacenter: local
        addr: -REDACTED-
        collectors:
            - Unix
        exporters:
            - influxdb
    cl01:
        datacenter: dc01
        addr: -REDACTED-
        auth_style: basic_auth
        username: -REDACTED-
        password: "-REDACTED-"
        use_insecure_tls: true
        exporters:
            - influxdb
        collectors:
            - Zapi
            - ZapiPerf
            - Rest
            - RestPerf
        labels:
            - customer: cus1
            - instance: cl01
    cl02:
        datacenter: dc02
        addr: -REDACTED-
        auth_style: basic_auth
        username: -REDACTED-
        password: "-REDACTED-"
        use_insecure_tls: true
        exporters:
            - influxdb
        collectors:
            - Rest
            - RestPerf
            - Zapi
            - ZapiPerf
        labels:
            - customer: cus1
            - instance: cl02
    cl3:
        datacenter: dc3
        addr: -REDACTED-
        auth_style: basic_auth
        username: -REDACTED-
        password: "-REDACTED-"
        use_insecure_tls: true
        exporters:
            - influxdb
        collectors:
            - Rest
            - RestPerf
            - Zapi
            - ZapiPerf
        labels:
            - customer: cus1
            - instance: cl3
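The three ONTAP pollers above differ mainly in collector order, which matters for this report because the healthy field appears only when Zapi is enabled or listed ahead of Rest. The sketch below makes that difference explicit. It is only an illustration: the collector lists are hand-copied from the configuration (no YAML parsing), and `first_config_collector` is a hypothetical helper, not part of Harvest.

```python
# Collector lists copied by hand from the pollers in the configuration above.
POLLERS = {
    "unix": ["Unix"],
    "cl01": ["Zapi", "ZapiPerf", "Rest", "RestPerf"],
    "cl02": ["Rest", "RestPerf", "Zapi", "ZapiPerf"],
    "cl3": ["Rest", "RestPerf", "Zapi", "ZapiPerf"],
}


def first_config_collector(collectors):
    """Return whichever of 'Zapi' or 'Rest' is listed first, or None if neither is."""
    for name in collectors:
        if name in ("Zapi", "Rest"):
            return name
    return None


if __name__ == "__main__":
    for poller, collectors in POLLERS.items():
        print(poller, first_config_collector(collectors))
```

Under that reading, cl01 is the only poller expected to report the healthy field, while cl02 and cl3 are the ones affected by the problem described here.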

Poller

Rest

Version

23.11

Poller logs

No response

OS and platform

Ubuntu 20.04

ONTAP or StorageGRID version

9.8.0, 9.13 (simbox)

Additional Context

No response

References

No response

frankvdbh avatar Nov 16 '23 11:11 frankvdbh

@frankvdbh Can you execute the following API call and see if it throws any errors?

https://<IP>/api/private/cli/node?fields=health,node,max_aggr_size,max_node_vvols,max_vol_size,vendor,cpu_busy_time

Please note that the REST collector is compatible with ONTAP versions 9.12 and later. More details here. I noticed that you're using version 9.13 on a simbox. If possible, could you test this API call on a physical hardware setup?

Also, can you share your logs with us? Please send them to [email protected]. You can find instructions on how to collect these logs here.

rahulguptajss avatar Nov 16 '23 12:11 rahulguptajss

Hi @rahulguptajss, I have tested this API call on both a 9.8 physical system and a 9.13.1 simbox. In both cases we get a response similar to this:

{
  "records": [
    {
      "node": "cl02-01",
      "vendor": "NetApp",
      "health": true,
      "cpu_busy_time": 22003,
      "max_aggr_size": 879609302220800,
      "max_vol_size": 329853488332800,
      "max_node_vvols": 500
    }
  ],
  "num_records": 1
}
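A response like this can be checked mechanically against the fields the template asks for, which helps separate an ONTAP-side gap from a collector-side one. The following is a rough sketch under stated assumptions: `missing_fields` is a hypothetical helper, and the sample payload is the response quoted above.

```python
import json

# ONTAP field names requested by the template's counters.
EXPECTED_FIELDS = {
    "node", "health", "max_aggr_size", "max_node_vvols",
    "max_vol_size", "vendor", "cpu_busy_time",
}

# The response quoted in this comment.
SAMPLE = """
{
  "records": [
    {
      "node": "cl02-01",
      "vendor": "NetApp",
      "health": true,
      "cpu_busy_time": 22003,
      "max_aggr_size": 879609302220800,
      "max_vol_size": 329853488332800,
      "max_node_vvols": 500
    }
  ],
  "num_records": 1
}
"""


def missing_fields(response_text, expected=EXPECTED_FIELDS):
    """Map each incomplete record's node name to its sorted list of absent fields."""
    payload = json.loads(response_text)
    missing = {}
    for record in payload.get("records", []):
        absent = sorted(set(expected) - set(record))
        if absent:
            missing[record.get("node", "<unknown>")] = absent
    return missing


if __name__ == "__main__":
    print(missing_fields(SAMPLE))
```

An empty result for the sample means ONTAP returns every requested field, so the missing healthy value would have to be lost somewhere after the API call.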

I will collect the logs and send them by mail.

frankvdbh avatar Nov 16 '23 12:11 frankvdbh

Hi @frankvdbh, are you all sorted here, or do you have any more questions?

cgrinds avatar Feb 27 '24 17:02 cgrinds

@frankvdbh Closing this one. Please reopen if you get a chance to follow up.

cgrinds avatar May 03 '24 18:05 cgrinds