feat(outputs.influxdb_v2): Report internal statistics on errors and written bytes
Summary
We want to keep track of how many bytes we are sending over a satellite connection, since we sometimes accidentally push away other traffic. I am not sure how (and if) to test these new metrics.
Checklist
- [x] No AI generated code was used in this PR
Related issues
resolves #17275
Done. The failing integration test is not related to my changes.
@LarsStegman your approach unfortunately has some issues... If people define tags or an alias for the plugin, it will not be possible to associate your metric with the plugin's own metric. However, this is not solvable from within the plugin, as it does not have access to this information.
I'm working on a PR that allows plugins to export statistics with that information added. I hope to put up a spec today, and I have a PoC I can share that you can base your PR on. I will share both PRs here later today...
Sorry for not solving this issue earlier...
Hey Sven, no problem. I already suspected I was missing something, because this way of creating stats is not used anywhere else as far as I can see.
@LarsStegman please check the spec in PR #17344 and the draft PR adding the framework in PR #17345.
@srebhan updated to use the selfstat.Collector interface
@srebhan I included your suggestions. Now there is only the url tag for the internal_outputs.influxdb_v2 measurement. The write measurement does not have the url tag.
:relaxed: This pull request doesn't significantly change the Telegraf binary size (less than 1%)
Sorry, it's been a busy couple of weeks. I've made the requested changes.
To be honest I don't like the idea of having a metric of the form internal_<plugin name> as this makes it hard to write queries across multiple different plugins IMO. Furthermore, we already have a tag to distinguish the plugin types! What I would accept though is something like write_extra or write_details or similar which can be used across multiple plugins with identical field names (if the plugins can provide them). This way you could e.g. accumulate the number of timeouts across multiple plugins and trigger an alert...
Is it an idea to write down a spec and discuss it in that PR? Maybe extend this one or create a new one specifically for the schema.
This one can wait for a little bit until it's agreed what the internal metrics should look like. I am not sure where your priorities are, but maybe it's time for a bigger change in how internal metrics are handled?
@LarsStegman that's an excellent idea. There should be a spec defining a naming scheme for statistics including what we already have. Would appreciate if you and/or @Hipska and every interested party can start and discuss this!
I would also like to see this extended in the existing TSD-011. A consistent naming for new selfstat metrics that don't conflict with existing ones is preferable.
FYI, this is what I found in inputs.internal regarding the naming of metrics from inside a plugin:
internal_<plugin_name> are metrics which are defined on a per-plugin basis, and usually contain tags which differentiate each instance of a particular type of plugin and version=<telegraf_version>.
@LarsStegman any chance you work on a spec?
Hi Sven, sorry, I don't think I will have time to do that soon. Maybe somewhere in February (or in my holidays in the next two weeks, if I'm bored :P ), but I can't make any promises. If you need it soon, I think you're better off writing something yourselves. I'm happy to review it though!
I'm also out till January 12 so don't expect anything... Shall we close this PR until we do have a concrete way forward?