snmp_exporter icon indicating copy to clipboard operation
snmp_exporter copied to clipboard

[Discussion] Add 'snmp_' prefix to metrics?

Open seanhoughton opened this issue 8 years ago • 42 comments

The first item in the exporter metric naming best practices list is

A metric name ... should have a (single-word) application prefix relevant to the domain the metric belongs to.

  • https://prometheus.io/docs/practices/naming/

None of the metric names in the current file adhere to this practice, it probably makes sense to just automatically prefix each metric with snmp_

seanhoughton avatar Jan 06 '17 21:01 seanhoughton

In the case of SNMP, all the names have already been well-established over decades and follow another naming scheme. Adding snmp_ would likely cause more harm than good.

brian-brazil avatar Jan 06 '17 21:01 brian-brazil

I agree, the naming of SNMP is too well-established.

Richard

Sent by mobile; excuse my brevity.

RichiH avatar Jan 07 '17 05:01 RichiH

Correct, the individual metrics have long-established names, but by Prometheus policy these should be grouped into their domain. We shouldn't modify the metric name, simply prepend them.

snmp_ifHCInOctets for example makes it much easier to match when doing things like federation.

SuperQ avatar Jan 07 '17 08:01 SuperQ

It's a guideline not policy, and SNMP is the canonical example of where the balance tips the other way.

You wouldn't be federating names like that, you'd be federating on the aggregation prefix.

brian-brazil avatar Jan 07 '17 08:01 brian-brazil

I'll leave it up to you guys to decide, but the curse of "sure it doesn't follow convention and makes things more difficult, but that's the way we've always done it" has caused untold pain in my professional career. All of the other exporters provide a very clear prefix that makes browsing the available metrics for the exporter you're concerned with very easy. The snmp exporter is the only one in the bunch that scatters metrics all around because of the lack of a namespacing prefix. However, the nice thing about open source is that I can make a fork with the prefix added and go about my business :)

seanhoughton avatar Jan 08 '17 02:01 seanhoughton

All of the other exporters provide a very clear prefix that makes browsing the available metrics for the exporter you're concerned with very easy.

I mentioned this when this previously came up, but this is not true in general. It's normal for there to be quite a bit of overlap in the metrics between all of your applications, so this is not something you can depend on.

brian-brazil avatar Jan 08 '17 10:01 brian-brazil

For the record, after some more deliberations, I see prefixing snmp_ as a valid request. Having on shared namespace helps avoid confusion down the line.

Either way, this should either be implemented, or the issue closed.

RichiH avatar Jan 26 '17 04:01 RichiH

I'm thinking about adding an optional prefix config item to the generator. That would solve it for everyone.

SuperQ avatar Jan 26 '17 08:01 SuperQ

This should not be a configurable option, a given metric should only have one name.

No new arguments have been presented since this first came up, so I'm going to close it.

brian-brazil avatar Jan 26 '17 09:01 brian-brazil

Having two different names is worse than no name space. People will, and should, start sharing queries etc.

RichiH avatar Jan 26 '17 10:01 RichiH

My frustration with this is that it decreases the usefulness of autocomplete. For example, searching for sys does not suggest me sysUpTime, while searching prometheus_sd suggests me prometheus_sd_updates_total.

Because it breaks convention, also causes some inconveniences in some other tools like Grafana:

image

As you can see, Grafana groups metrics by their underscore-seperated names. Any use of the snmp exporter makes this menu useless, as it is now cluttered with hundreds of snmp entries. Now, arguably this is a Grafana issue, but it won't be the first tool that makes assumptions about the format of metric names that becomes less useful when you use the snmp exporter.

NotAFile avatar Jan 19 '20 16:01 NotAFile

We could prefix snmp__ in anticipation of OM.

Sent by mobile; please excuse my brevity.

On Sun, Jan 19, 2020, 17:41 NotAFile [email protected] wrote:

My frustration with this is that it decreases the usefulness of autocomplete. For example, searching for sys does not suggest me sysUpTime, while searching prometheus_sd suggests me prometheus_sd_updates_total.

Because it breaks convention, also causes some inconveniences in some other tools like Grafana:

[image: image] https://user-images.githubusercontent.com/5447747/72684428-a09dc180-3ae0-11ea-9b50-f64e018cd744.png

As you can see, Grafana groups metrics by their underscore-seperated names. Any use of the snmp exporter makes this menu useless, as it is now cluttered with hundreds of snmp entries. Now, arguably this is a Grafana issue, but it won't be the first tool that makes assumptions about the format of metric names that becomes less useful when you use the snmp exporter.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/prometheus/snmp_exporter/issues/104?email_source=notifications&email_token=AAFYII4R7IZZPFMZSMJMINDQ6R7DTA5CNFSM4C3USPRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJKWR4I#issuecomment-576022769, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFYII6PUQF76OP73EAEM5DQ6R7DTANCNFSM4C3USPRA .

RichiH avatar Jan 19 '20 16:01 RichiH

There's still no plans to ever do that, and it'd be a major breaking change.

brian-brazil avatar Jan 19 '20 16:01 brian-brazil

I just found this issue and I still do not understand why it isn't a supported feature. Not namespacing the metrics is directly against the Prometheus conventions for naming metrics. It doesn't matter if you're following the SNMP convention because this isn't SNMP, it's a Prometheus exporter.

robotmay avatar Mar 20 '20 16:03 robotmay

If anyone else finds this issue looking for an actual solution, put the following in your job config in prometheus.yml:

    metric_relabel_configs:
    - source_labels: [__name__]
      regex: '(.*)'
      replacement: 'snmp_${1}'
      target_label: __name__

robotmay avatar Mar 20 '20 16:03 robotmay

We are planning to have a breaking release. This is a once-in-a-decade (seriously...) opportunity to revisit this topic. I am undecided myself as there are strong arguments either way, but we should deliberate it one last time.

Pro:

  • Fits into Prometheus patterns
  • Makes discovery of SNMP values easier

Con:

  • Breaks literally everything. All alerts, all dashboards, all everything
  • Makes grouping and such harder as labels names are e.g. ifDescr and grouping with snmp_ifDescr is painful. This alone might mean we might need to extend PromQL.

RichiH avatar Feb 25 '21 10:02 RichiH

Put differently: A networking-focused org would likely tend to prefer no prefixes while a more general org would likely tend to prefer a prefix.

There's no good answer here, only a potentially least-bad one.

If in doubt, not breaking ALL the things over night is the preferable option.

RichiH avatar Feb 25 '21 10:02 RichiH

Could it just be a configurable option? I guess it's not especially difficult to use the relabel config above, but having something simple like prefix_metrics: true might make it more obvious :thinking:

Also I have just reread my comments above and they're grumpier than I intended, so I apologise for that :sweat_smile:

robotmay avatar Feb 25 '21 10:02 robotmay

I've been thinking about adding an in-exporter metric relabel config, similar to Prometheus itself.

It's duplication of work and code, but it allows for simpler downstream. Plus we can fix other issues like the recent nulls in labels issues.

SuperQ avatar Feb 25 '21 11:02 SuperQ

Have you considered to make the module part of the prefix? IMHO this would be a huge benefit with regards to discovery. i.e. having a prefix of snmp_apcups_ and snmp_paloalto_fw_ may come in quite handy. Not sure on the implications on prometheus.

momorientes avatar Feb 25 '21 12:02 momorientes

That last suggestion will break even more stuff.. I would respectfully suggest against it. ;-)

xkilian avatar Feb 25 '21 13:02 xkilian

I have to admit that having an snmp_ prefix would fix the autocomplete stuff. But if ever there was a time, this is it. It would break some dashboards on my end. Not that many, but still. A company that have LOTS of snmp dashboard, ouch. I think there needs to be good feedback to get the community pulse on this.

xkilian avatar Feb 25 '21 13:02 xkilian

Have you considered to make the module part of the prefix? IMHO this would be a huge benefit with regards to discovery. i.e. having a prefix of snmp_apcups_ and snmp_paloalto_fw_ may come in quite handy. Not sure on the implications on prometheus.

I think this is a very bad idea, as it makes it impossible to aggregate metrics across modules. For example, paloalto_fw and if_mib both have ifInOctets metrics, the prefix would prevent any reuse over devices.

NotAFile avatar Feb 25 '21 15:02 NotAFile

Again, this is why I think metric_relabel_configs in the generator/snmp config is the right way to go. Users can add their own prefix, suffix, whatever they want.

Something like this:

modules:
  # Default IF-MIB interfaces table with ifIndex.
  if_mib:
    walk: [sysUpTime, interfaces, ifXTable]
    metric_relabel_configs:
    - source_labels: [__name__]
      target_label: __name__
      replacement: 'snmp_$1'

SuperQ avatar Feb 25 '21 15:02 SuperQ

Module-specific prefixes are not a good idea, but the job label can help here.

Relabeling is very powerful, but it's also one of the pain point for users.

For a highly scoped question of snmp_ yes/no, it seems better to have an option if the users should be able to choose.

But: I do not believe we can support both at the same time. Sharing dashboards, alert rules, etc becomes next to impossible even we assume Mixins were magically widely used over night.

All that is a long-winded way of saying I think we will need to pick one or the other and live with it.

As such, it's more about finding the least bad, holding our noses, and doing it forevermore.

RichiH avatar Feb 26 '21 11:02 RichiH

I would vote to do it in a consistent way, so add snmp_ to all. Then if someone has way too many dashboards, they can use proposed override regex to fix it until they correct their dashboards. This will address the consistency and moving forward all published dashboards would use the new prefix. We all agree that snmp_ prefix is better, so I think this would be way to do it. In the doc we include an example to revert the prefix.. All bases covered.

xkilian avatar Feb 27 '21 13:02 xkilian

After a lot of thinking, I am leaning towards keeping as-is forevermore. I am reaching out to a few people to get actively challenged on this as my thinking might be too network-centric.

RichiH avatar Mar 15 '21 07:03 RichiH

Note to self: If we don't do prefixes, we need to create docs, FAQ, etc.

RichiH avatar Mar 15 '21 07:03 RichiH

Hi, if you started with prometheus recently, you probably would vote for metric prefix, but if you have snmp and prometheus running for many years because you have an 5+ years retention in your TSDB this is a totally no go. This will completely break your dashboards and history for your metrics.

For one metric in the dashboard, you need to add two metric queries, one without the prefix and one with the prefix, because you want your graph to continue if you change the time window. You don't wanna change dashboards because of date X you changed the metric name or you don't wann break the graph line into multiple metrics and labels, because this will also break your calculations for trends/results on longterm. If there is no chance to rename alle your "snmp" metrics in your tsdb within a short time, as a longterm snmp/prometheus user this is just a big mess.

We will have no gain of this cosmetic snmp_ prefix change, really. I'm working for about 5 years now with snmp/prometheus, we have 10 year retention on the TSDB, I have never used this auto grouping of metrics in Grafana or really missed that "snmp_" was missing. That a metric was imported by snmp_exporter is a part of my label and if there are two different exporters who export the same "snmp" metric but via different protocol, I would prefer they have the same metric, so I can still use the same dashboard and use a label to switch between the exporters.

The migration from the old snmp_exporter to the new one with the generator, messed up so many metrics and ruined many longterm statistics.

Anyway, I understand the problem of the missing snmp_ and it should be well documented, especially how to fix it (config flag, or metric relabel) and what it will break if you change this option. prefix-flag not configured (not in config file) -> no prefix; prefix-flag configured: displayed in all example configuration files with enabled as default with short warning and link to documentation that it will cause you pain because you will mess up your tsdb history for your metric history.

Did you checked all other exporters in the public internet if they are using this prefix method?

These are just my first thoughts on this topic.

LukeSiX avatar Mar 15 '21 09:03 LukeSiX

I also don't really see a big gain in prefixing the metrics name with snmp_.

Just like @brian-brazil and @SuperQ stated before - the names have a history for decades and from my point of view these metric names should be in sync with what the MIB files declare. People tend to use other monitoring systems as well, changing metric names will make it harder to co-operate with these systems as well.

bastischubert avatar Mar 15 '21 10:03 bastischubert