Enable / Disable plugin instances
Use Case
In one of our deployments, we plan to run multiple instances of Telegraf, all of them fetching their config from an HTTP endpoint. Each running instance is only supposed to "activate" the set of plugins for which it is configured. Although this can be achieved with some tedious configuration, I think a more elegant solution would be a mechanism similar to what k8s does with its Labels and Selector logic.
There have been requests for this feature dating back to 2016! https://github.com/influxdata/telegraf/issues/9304 https://github.com/influxdata/telegraf/issues/1317 https://github.com/influxdata/telegraf/issues/10543
Expected behavior
Telegraf runs with only "enabled" plugins
Actual behavior
No such provision directly.
Additional info
A few of the solutions offered previously:
- Comment out the plugin to "disable" it. This works, but for a centralized config this approach is not feasible.
- Rename the file so that it is ignored. Since Telegraf only reads .conf files, we can effectively disable a set of plugins by renaming the file to something like .conf.ignore. This approach is complex to manage, but good for debugging.
The proposed solution, using selectors and labels, is extensible. To ensure backward compatibility, we need to define certain rules.
Compatibility Matrix: Telegraf Plugin Selector System
| Telegraf Run State | Selector Present | Behavior |
|---|---|---|
| Running with --label | Selector Present | Plugin is selected if the selector matches the label. Else if selector doesn't match, plugin is not selected. |
| Running with --label | Selector Not Present | Plugin is selected (backward compatibility). |
| Running without --label | Selector Present | Plugin is selected (no label to compare against). |
| Running without --label | Selector Not Present | Plugin is selected (current behavior). |
In essence, if we want to disable the plugin, all we need to do is run telegraf with a label and without a matching selector. It will act as a negative selector.
Users who wish to take full advantage of this feature will eventually provide matching selectors for their running instances of Telegraf.
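To make the matrix concrete, here is a minimal Go sketch of the rule, assuming exact key/value matching. The function name `isSelected` and the map-based representation are hypothetical and used only for illustration; this is not the code from the PR.

```go
package main

import "fmt"

// isSelected is a hypothetical helper illustrating the compatibility matrix.
// labels:   key/value pairs given via --label (empty when the flag is absent).
// selector: key/value pairs from the plugin's selector option (empty when absent).
func isSelected(labels, selector map[string]string) bool {
	// Without labels or without a selector the plugin is always kept,
	// which preserves the current behavior for existing configs.
	if len(labels) == 0 || len(selector) == 0 {
		return true
	}
	// Every selector pair must appear in the labels with the same value,
	// i.e. the selector has to be a subset of the labels.
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	labels := map[string]string{"app": "frontend", "drop": "news-1"}

	fmt.Println(isSelected(labels, map[string]string{"app": "frontend"})) // true
	fmt.Println(isSelected(labels, map[string]string{"app": "backend"}))  // false
	fmt.Println(isSelected(labels, nil))                                  // true
	fmt.Println(isSelected(nil, map[string]string{"app": "backend"}))     // true
}
```

Running with a label that no selector matches drops the plugin, which is the "negative selector" behavior described above.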
As an example, we have the following
telegraf --config-directory selectors/ --label="app:frontend" --label="drop:news-1" --watch-config=notify --print-plugin-config-source=true
Let's say we have
selectors/
one.conf
two.conf
one.conf
[[outputs.http]]
selector = {app = "frontend"}
[[inputs.mem]]
selector = {app = "frontend2"}
[[inputs.cpu]]
selector = {app = "backend"}
two.conf
[[inputs.disk]]
selector = {app = "frontend", drop = "news-2"}
[[inputs.netstat]]
[inputs.netstat.selector]
app = "frontend"
drop = "news-1"
[[inputs.powerdns]]
selector = {app = "frontend"}
The following would be selected
outputs.http
inputs.netstat
inputs.powerdns
The following would be dropped
inputs.mem
inputs.cpu
inputs.disk
Suggestions to improve this are welcome. I have created a PR for this: #16705
I would propose using array syntax for selectors, which makes it similar to the existing metric filtering config.
Ex:
[[inputs.netstat]]
[inputs.netstat.selector]
app = ["frontend*", "backend"]
drop = ["news-1"]
So it would be selected for any frontend or the backend app.
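A rough Go sketch of how such array-valued selectors could be evaluated follows. It assumes that any pattern within one key may match (OR) and that all keys must match (AND); the second part is an assumption, not something stated above. The names `matchesSelector` and `matchesAny` are hypothetical, and the standard library's `path.Match` stands in for whatever glob matcher Telegraf would actually use.

```go
package main

import (
	"fmt"
	"path"
)

// matchesAny reports whether value matches at least one glob pattern.
// path.Match is only a stand-in: it treats '/' as a separator, which is
// harmless for label values that contain no slashes.
func matchesAny(value string, patterns []string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, value); err == nil && ok {
			return true
		}
	}
	return false
}

// matchesSelector is a hypothetical helper: for every selector key the label
// must exist and match one of the patterns (assumed AND across keys).
func matchesSelector(labels map[string]string, selector map[string][]string) bool {
	for key, patterns := range selector {
		value, ok := labels[key]
		if !ok || !matchesAny(value, patterns) {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string][]string{
		"app":  {"frontend*", "backend"},
		"drop": {"news-1"},
	}

	fmt.Println(matchesSelector(map[string]string{"app": "frontend-eu", "drop": "news-1"}, selector)) // true
	fmt.Println(matchesSelector(map[string]string{"app": "backend", "drop": "news-1"}, selector))     // true
	fmt.Println(matchesSelector(map[string]string{"app": "worker", "drop": "news-1"}, selector))      // false
	fmt.Println(matchesSelector(map[string]string{"app": "frontend", "drop": "news-2"}, selector))    // false
}
```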
@neelayu hmmm, why wouldn't a slice, something like
[[outputs.http]]
selectors = ["frontend", "backend"]
[[inputs.mem]]
selectors = ["backend"]
[[inputs.cpu]]
selectors = ["backend"]
[[inputs.modbus]]
selectors = ["factory"]
be enough? This way the checking is pretty easy I think.
My biggest concern is the complexity this might grow into. What if you specify
# telegraf --label frontend --label factory
Does it mean "frontend" and "factory" or does it mean "frontend" or "factory"? In any case I would like to ask you to write a spec for this so the behavior is clear for everyone... See https://github.com/influxdata/telegraf/tree/master/docs/specs for other specs we already have.
I see the confusion.
The --label flag is essentially what k8s has in metadata.labels. It is a way of attaching key/value pairs to objects. In a way, it's just a list.
The selectors, on the other hand, are similar to k8s matchLabels, i.e. all selectors should match labels (selectors are supposed to be a subset of the labels).
I have kept it similar to k8s since it provides a way to expand our capabilities in the future to incorporate selections based on matchExpressions and operator.
Let me know your thoughts. As @Hipska also mentions, wildcard matching is also an option. But this gives us strict matching.
@neelayu I'm not familiar with the k8s selection mechanism, but I don't want to make things more complicated in Telegraf than they need to be. Using maps has the drawback that, if you use tables for declarations, the table must be at the end of the plugin declaration. Furthermore, we can still have label pairs by doing
[[inputs.foo]]
selectors = ["apps.backend", "source.backend", ...]
or
[[inputs.foo]]
selectors = ["apps=backend", "source=backend", ...]
If you then use a filter expression and specify the dot (.) or equal-sign (=) as separator you can easily do expressions like *.backend or apps.*.
Yes, I am aware of the TOML limitations for tables or maps. Your proposed syntax, selectors=["apps=backend"], seems elegant and captures the necessary information. Just to double check, can you also confirm whether we want to support wildcards like *? I am not inclined towards it, as it will make things complicated.
If that is the case, I can draft up a spec for the same. Thanks!
Well, the idea for the wildcards comes from the fact that we see this often. So what you can do is use a filter, which should keep wildcards from making things complicated. However, if you do have doubts, we can start off with exact matches, but believe me, the code will not be much simpler. ;-)
Please also think about AND and OR operations, e.g. being able to specify --select multiple times means an OR, and a comma-separated list of statements means an AND. For example, --select 'apps=backend, source=factory*' --select 'source=frontend' would mean select everything that has (apps == backend AND source 'starts with' factory) OR source == frontend. We don't need this in the first implementation, but I know this will be a feature request sooner or later...
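A small Go sketch of these semantics follows, assuming plugins carry "key=value" strings and each --select value is a comma-separated list of glob patterns. The names `selected`, `allMatch` and `anyTagMatches` are hypothetical, `path.Match` from the standard library stands in for Telegraf's own filter, and treating "no --select given" as "select everything" is an assumption made for backward compatibility.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// anyTagMatches reports whether at least one plugin tag matches the pattern.
func anyTagMatches(tags []string, pattern string) bool {
	for _, tag := range tags {
		if ok, err := path.Match(pattern, tag); err == nil && ok {
			return true
		}
	}
	return false
}

// allMatch implements the AND: every statement must match some plugin tag.
func allMatch(tags, statements []string) bool {
	for _, stmt := range statements {
		if !anyTagMatches(tags, strings.TrimSpace(stmt)) {
			return false
		}
	}
	return true
}

// selected implements the OR across repeated --select values.
func selected(tags, selects []string) bool {
	if len(selects) == 0 {
		return true // assumption: no --select keeps every plugin
	}
	for _, expr := range selects {
		if allMatch(tags, strings.Split(expr, ",")) {
			return true
		}
	}
	return false
}

func main() {
	selects := []string{"apps=backend, source=factory*", "source=frontend"}

	fmt.Println(selected([]string{"apps=backend", "source=factory-7"}, selects)) // true
	fmt.Println(selected([]string{"apps=backend", "source=office"}, selects))    // false
	fmt.Println(selected([]string{"source=frontend"}, selects))                  // true
}
```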
A spec (see https://github.com/influxdata/telegraf/tree/master/docs/specs) would be good though to have things documented.
Hi @srebhan and @Hipska I managed to draft up the spec. https://github.com/influxdata/telegraf/pull/16884
I believe we can go with wildcard matching using filter. The document specifies all the necessary information. Thanks!
Hi @srebhan and @Hipska, sorry for pinging you, but did you get a chance to look at the spec? thanks!
I did and approved 😉
Thanks @Hipska