feat(outputs.influxdb_v2): Report internal statistics on errors and written bytes

Open LarsStegman opened this issue 5 months ago • 13 comments

Summary

We want to keep track of how many bytes we are sending over a satellite connection, since we sometimes accidentally crowd out other traffic. I am not sure how (or whether) to test these new metrics.

Checklist

  • [x] No AI generated code was used in this PR

Related issues

resolves #17275

LarsStegman avatar Jul 02 '25 08:07 LarsStegman

Done. The failing integration test is not related to my changes.

LarsStegman avatar Jul 07 '25 07:07 LarsStegman

@LarsStegman your approach unfortunately has some issues. If people define some tags or an alias, it will not be possible to associate your metric with the one of the plugin. However, this is not solvable from within the plugin, as it does not have access to this information.

I'm working on a PR that allows plugins to export statistics with that information added. I hope to put up a spec today, and I do have a PoC I can share that you can base your PR on. I will share both PRs here later today...

Sorry for not solving this issue earlier...

srebhan avatar Jul 16 '25 09:07 srebhan

Hey Sven, no problem. I already suspected I was missing something, because this way of creating stats is not used anywhere else as far as I can see.

LarsStegman avatar Jul 16 '25 16:07 LarsStegman

@LarsStegman please check the spec in PR #17344 and the draft PR adding the framework in PR #17345.

srebhan avatar Jul 16 '25 18:07 srebhan

@srebhan updated to use the selfstat.Collector interface

LarsStegman avatar Aug 29 '25 08:08 LarsStegman

@srebhan I included your suggestions. Now there is only the url tag for the internal_outputs.influxdb_v2 measurement. The write measurement does not have the url tag.

LarsStegman avatar Sep 09 '25 11:09 LarsStegman

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip. Downloads for additional architectures and packages are available below.

:relaxed: This pull request doesn't significantly change the Telegraf binary size (less than 1%)

:package: Click here to get additional PR build artifacts

Artifact URLs

| DEB | RPM | TAR.GZ | ZIP |
| --- | --- | --- | --- |
| amd64.deb | aarch64.rpm | darwin_amd64.tar.gz | windows_amd64.zip |
| arm64.deb | armel.rpm | darwin_arm64.tar.gz | windows_arm64.zip |
| armel.deb | armv6hl.rpm | freebsd_amd64.tar.gz | windows_i386.zip |
| armhf.deb | i386.rpm | freebsd_armv7.tar.gz | |
| i386.deb | ppc64le.rpm | freebsd_i386.tar.gz | |
| mips.deb | riscv64.rpm | linux_amd64.tar.gz | |
| mipsel.deb | s390x.rpm | linux_arm64.tar.gz | |
| ppc64el.deb | x86_64.rpm | linux_armel.tar.gz | |
| riscv64.deb | | linux_armhf.tar.gz | |
| s390x.deb | | linux_i386.tar.gz | |
| | | linux_mips.tar.gz | |
| | | linux_mipsel.tar.gz | |
| | | linux_ppc64le.tar.gz | |
| | | linux_riscv64.tar.gz | |
| | | linux_s390x.tar.gz | |

telegraf-tiger[bot] avatar Sep 12 '25 09:09 telegraf-tiger[bot]

Sorry, it's been a busy couple of weeks. I've made the requested changes.

LarsStegman avatar Oct 01 '25 07:10 LarsStegman

To be honest I don't like the idea of having a metric of the form internal_<plugin name> as this makes it hard to write queries across multiple different plugins IMO. Furthermore, we already have a tag to distinguish the plugin types! What I would accept though is something like write_extra or write_details or similar which can be used across multiple plugins with identical field names (if the plugins can provide them). This way you could e.g. accumulate the number of timeouts across multiple plugins and trigger an alert...
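As a rough illustration of the cross-plugin accumulation mentioned above, here is a minimal sketch. It assumes a shared measurement with identical field names across plugins. The `stat` type, the measurement name `write_details`, the `output` tag and the `timeouts` field are placeholders for the sake of the example, not an agreed schema.

```go
package main

import "fmt"

// stat is a hypothetical, simplified view of one internal metric:
// a measurement name, its tags, and integer fields.
type stat struct {
	measurement string
	tags        map[string]string
	fields      map[string]int64
}

// sumField adds up one field over every stat of the given measurement,
// regardless of which plugin reported it.
func sumField(stats []stat, measurement, field string) int64 {
	var total int64
	for _, s := range stats {
		if s.measurement != measurement {
			continue
		}
		total += s.fields[field] // a missing field contributes zero
	}
	return total
}

func main() {
	// Assumed names: "write_details", the "output" tag and the "timeouts"
	// field are illustrative only.
	stats := []stat{
		{"write_details", map[string]string{"output": "influxdb_v2"}, map[string]int64{"timeouts": 3}},
		{"write_details", map[string]string{"output": "kafka"}, map[string]int64{"timeouts": 2}},
		{"internal_write", map[string]string{"output": "kafka"}, map[string]int64{"metrics_written": 100}},
	}
	fmt.Println("total timeouts:", sumField(stats, "write_details", "timeouts"))
}
```

With one measurement name per plugin, the same total would require knowing every plugin-specific measurement up front, which is the query problem this comment raises.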

srebhan avatar Oct 08 '25 09:10 srebhan

Is it an idea to write down a spec and discuss it in that PR? Maybe extend this one or create a new one specifically for the schema.

This one can wait a little until it's agreed what the internal metrics should look like. I am not sure where your priorities are, but maybe it's time for a bigger change in how internal metrics are handled?

LarsStegman avatar Oct 08 '25 11:10 LarsStegman

@LarsStegman that's an excellent idea. There should be a spec defining a naming scheme for statistics, including what we already have. I would appreciate it if you and/or @Hipska and every interested party could start discussing this!

srebhan avatar Oct 10 '25 08:10 srebhan

I would also like to see this added as an extension to the existing TSD-011. A consistent naming scheme for new selfstat metrics that doesn't conflict with existing ones is preferable.

Hipska avatar Nov 05 '25 14:11 Hipska

FYI, this is what I found in inputs.internal regarding the naming of metrics from inside a plugin:

internal_<plugin_name> are metrics which are defined on a per-plugin basis, and usually contain tags which differentiate each instance of a particular type of plugin and version=<telegraf_version>.
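A minimal sketch of that naming convention, for illustration only: the helpers `perPluginName` and `seriesKey`, the plugin name `example`, the `instance` tag and the version value are all hypothetical, and the rendering skips the escaping that real line protocol requires.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// perPluginName builds a measurement name of the internal_<plugin_name> form.
func perPluginName(plugin string) string {
	return "internal_" + plugin
}

// seriesKey renders a measurement and its tags as "name,k1=v1,k2=v2" with
// tags sorted by key. It performs no escaping and is for illustration only.
func seriesKey(measurement string, tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(measurement)
	for _, k := range keys {
		b.WriteString("," + k + "=" + tags[k])
	}
	return b.String()
}

func main() {
	// Hypothetical tags: one that tells plugin instances apart, plus the
	// Telegraf version tag mentioned in the quoted documentation.
	tags := map[string]string{"instance": "a", "version": "1.2.3"}
	fmt.Println(seriesKey(perPluginName("example"), tags))
}
```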

Hipska avatar Nov 07 '25 15:11 Hipska

@LarsStegman any chance you could work on a spec?

srebhan avatar Dec 17 '25 09:12 srebhan

Hi Sven, sorry, I don't think I will have time to do that soon. Maybe somewhere in February (or in my holidays in the next two weeks, if I'm bored :P ), but I can't make any promises. If you need it soon, I think you're better off writing something yourselves. I'm happy to review it though!

LarsStegman avatar Dec 17 '25 10:12 LarsStegman

I'm also out till January 12 so don't expect anything... Shall we close this PR until we do have a concrete way forward?

srebhan avatar Dec 19 '25 14:12 srebhan