telegraf
telegraf copied to clipboard
feat(parsers.avro): Add Apache Avro parser plugin
Required for all PRs
- [x] Updated associated README.md.
- [x] Wrote appropriate unit tests.
- [x] Pull request title or commits are in conventional commit format
resolves #1630
This is a replacement for https://github.com/influxdata/telegraf/pull/7732 since the original author (@emanuele-falzone ) has gone silent.
This builds on Emanuele Falzone's work to allow ingestion from Avro serialized format. It can either connect to a schema registry or a schema can be specified in the parser.
It also extends the internal time parser to allow use of floating-point timestamps with specified precision, which is a feature we need for the Rubin Observatory's use-case.
@athornton please also rebase to latest master to get CircleCI back functional.
Thank you for the detailed review. I'll get to work on it. I have no objection to a round_timestamps_to config item rather than my initial implementation (the reason to round at all basically comes down to https://docs.influxdata.com/influxdb/v2.4/reference/faq/#does-the-precision-of-the-timestamp-matter , and I at least find it much easier to eyeball the data if all digits past the precision we care about are zero rather than the deterministic-but-basically-random stuff that the conversion gives us).
Also take note that a lot of the motivation for what I'm doing here really isn't obvious until https://github.com/influxdata/telegraf/pull/11831 . Basically, this is the avro parsing piece that will then make it possible to extend the Kafka input consumer with dynamic regexp-based topic detection and periodic scanning for new topics. These are features we need (and that streamreactor gives us but only for InfluxDBv1).
@srebhan I think I'm a little stuck. I've rebased against current master and CircleCI is still not working, and I don't know how to tell test_registry.go to not invoke the backwards-compatibility check.
adam@ixitxachitl:~/git/telegraf/plugins/parsers$ go test
--- FAIL: TestRegistry_BackwardCompatibility (0.01s)
registry_test.go:47: testing "csv"...
registry_test.go:47: testing "graphite"...
registry_test.go:47: testing "grok"...
registry_test.go:47: testing "nagios"...
registry_test.go:47: testing "value"...
registry_test.go:47: testing "xpath_json"...
registry_test.go:47: testing "influx_upstream"...
registry_test.go:47: testing "json_v2"...
registry_test.go:47: testing "prometheusremotewrite"...
registry_test.go:47: testing "wavefront"...
registry_test.go:47: testing "avro"...
registry_test.go:60:
Error Trace: /Users/adam/git/telegraf/plugins/parsers/registry_test.go:60
Error: Received unexpected error:
exactly one of 'schema_registry' or 'schema' must be specified
Test: TestRegistry_BackwardCompatibility
FAIL
exit status 1
My suspicion is that the CircleCI config isn't working because this is the lsst-sqre fork of Telegraf, so we really don't have CircleCI permissions. The configuration files are identical.
I added a very crude way to skip parsers that are not backward-compatible: a map at the top of the test file, and a check to see if the parser's name was in that map. Surely there's a better way of doing this, but I didn't find it. Happy to change it if someone points me in the right direction.
Other than that...I think I've addressed all the commentary except improving the test suite.
@athornton regarding timestamps, please also take a look at #11875! While not directly obvious, this changes the timestamp parsing to handle timestamps by keeping the precision to nanoseconds. If you need more, we should try to talk about this in a separate issue describing your expected behavior.
Oh, that timestamp work looks very nice. I'll try to get to it soon, and I will change the skip-compatibility test. Does going through big.Rat() cause any noticeable speed hit? I would think for high-volume metrics going through arbitrary precision might hurt.
The test suite is proving difficult for me--basically, without an InitFromConfig it's not clear to me how to load a new telegraf.conf for a parsing run and initialize the parser from it (Init() fails, obviously, because it wants schema or schemaURL, so I have to construct the Parser with fields set from the config so that Init() will work; but that means I need to extract them from the config and there's some unmarshaling I am missing) . I think I am about to end up basically manually doing what InitFromConfig did in my test suite.
I've added a file-based test suite, and use it to check for all the Init() errors (and a happy-path case). It could use some more attention, but I think this begins to get the point across.
@srebhan I've grabbed the changes from #11875 and applied them to this one. We've tested it for our use case and it works just fine, so I dropped our round_timestamp_to stuff.
@srebhan is there anything else you need me to do on this one?
@srebhan repinging: I think I've done what I can here (the CircleCI seems to legit be "I do not have permission to do so"). Is there anything else I need to do on this before it's ready to merge?
I'm going to close and re-open this PR to see if it can get circleci going again
I have a suspicion that I know what's going on. This is based on my experience working with conda. Could it be that CircleCI (like conda machinery) will only run from a personal fork, not an organizational fork? I know that some distinction between the two interfered with some CI process there. In that case it had to be a personal fork from the mainline repo--a personal fork from the organizational fork wouldn't work either.
Should I try forking and applying the changes? We'll lose the commit history, I think (which is complicated and messy anyway--the big problem with that is that we'll lose the credit for Emanuele Falzone, who, honestly, did most of the work here). But that'd be an easy enough test to see whether that's the problem at all.
will only run from a personal fork, not an organizational fork?
We have had some folks from Home Depot submit from their org, so I am not sure, but it did run yesterday:
https://app.circleci.com/pipelines/github/influxdata/telegraf/13402/workflows/e22fa15c-e1db-4448-b1ae-3dd54abf7664
I'm not sure why there is still a "CircleCI Pipeline" entry in the list of checks, but the other checks were from yesterday
I'd really love to be able to land this so I can shift over to the (much smaller) add regexp support to kafka topic ingestion PR that sits on top of this. Is there anything I can do at this point? The Home Depot data seems pretty conclusive that it's not a personal/org fork problem.
Is there anything I can do at this point?
Looks like the tests didn't even run as there are go.mod issues. Running go mod tidy and pushing the little change should get you a clean run. I can watch to see if tests run again as well.
After that I can ping @srebhan next week as it looked like you were close to wrapping this up.
OK. Looks like we're back to everything but CircleCI passing.
@athornton any update on this PR?
@athornton any way you finish this PR?
Yeah, sorry, I got pulled off to do other things and didn't even see the November commentary. I will get back to it and we can get this thing over the finish line.
@athornton THANKS! Please let me know if this is ready for another review round!
Not ready yet. I merged the suggestions that I was sure were uncontroversial, but some of the other stuff I'm going to need to see if I had a reason for doing it the way I did. I'd hoped to work on it today, but I got a high-priority interrupt. Hopefully tomorrow I get a little while to work on it.
I have a question. I had planned to do https://github.com/influxdata/telegraf/pull/11831 as a separate PR on top of this. However, I suspect that our use case, where we get Avro schemas (and update them dynamically) from a kafka broker is actually a very common one. Since I'm digging back into this one, rolling that one in might be a good idea, or I could try to land that one separately after this one is merged. Which do you think is a better idea?
I could try to land that one separately after this one is merged
I would much rather see two PRs over one very large one.
Still working on it. Got it passing its old unit tests again and now I'm on implementing the requested changes to the test suite, but I have had other stuff that needs attention today.
@srebhan I failed to understand how to use the file plugin and Gather to eliminate a bunch of the setup.
The CircleCI Linux failures, I think, may be because I am not running Golang 1.20 yet on my Mac (using Brew, golang is still at 1.19).
At any rate, the unit tests are again passing, as is make check/make test. If you wanted to just fix the test case to do the configuration correctly I'd appreciate it; I'm kind of swamped in my day job again. If instead of doing that, you would point me to docs on what it is I'm supposed to do with Gather I'll be happy to take another crack at it, but it might be a little while.
@athornton here is a diff to clean up the go mod tidy issues:
diff --git a/go.mod b/go.mod
index 3b95a4eca..1f992bfc8 100644
--- a/go.mod
+++ b/go.mod
@@ -107,7 +107,7 @@ require (
github.com/jackc/pgtype v1.12.0
github.com/jackc/pgx/v4 v4.17.1
github.com/james4k/rcon v0.0.0-20120923215419-8fbb8268b60a
- github.com/jeremywohl/flatten v1.0.1
+ github.com/jeremywohl/flatten/v2 v2.0.0-20211013061545-07e4a09fb8e4
github.com/jhump/protoreflect v1.8.3-0.20210616212123-6cc1efa697ca
github.com/jmespath/go-jmespath v0.4.0
github.com/kardianos/service v1.2.2
@@ -199,11 +199,6 @@ require (
modernc.org/sqlite v1.19.2
)
-require (
- github.com/jeremywohl/flatten/v2 v2.0.0-20211013061545-07e4a09fb8e4 // indirect
- github.com/pkg/xattr v0.4.9 // indirect
-)
-
require (
cloud.google.com/go v0.107.0 // indirect
cloud.google.com/go/compute v1.14.0 // indirect
@@ -380,6 +375,7 @@ require (
github.com/pion/udp v0.1.4 // indirect
github.com/pkg/browser v0.0.0-20210911075715-681adbf594b8 // indirect
github.com/pkg/sftp v1.13.5 // indirect
+ github.com/pkg/xattr v0.4.9 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/power-devops/perfstat v0.0.0-20220216144756-c35f1ee13d7c // indirect
github.com/rcrowley/go-metrics v0.0.0-20201227073835-cf1acfcdf475 // indirect
diff --git a/go.sum b/go.sum
index 0a3b55104..585339e8c 100644
--- a/go.sum
+++ b/go.sum
@@ -1378,8 +1378,6 @@ github.com/jcmturner/gokrb5/v8 v8.4.3 h1:iTonLeSJOn7MVUtyMT+arAn5AKAPrkilzhGw8wE
github.com/jcmturner/gokrb5/v8 v8.4.3/go.mod h1:dqRwJGXznQrzw6cWmyo6kH+E7jksEQG/CyVWsJEsJO0=
github.com/jcmturner/rpc/v2 v2.0.3 h1:7FXXj8Ti1IaVFpSAziCZWNzbNuZmnvw/i6CqLNdWfZY=
github.com/jcmturner/rpc/v2 v2.0.3/go.mod h1:VUJYCIDm3PVOEHw8sgt091/20OJjskO/YJki3ELg/Hc=
-github.com/jeremywohl/flatten v1.0.1 h1:LrsxmB3hfwJuE+ptGOijix1PIfOoKLJ3Uee/mzbgtrs=
-github.com/jeremywohl/flatten v1.0.1/go.mod h1:4AmD/VxjWcI5SRB0n6szE2A6s2fsNHDLO0nAlMHgfLQ=
github.com/jeremywohl/flatten/v2 v2.0.0-20211013061545-07e4a09fb8e4 h1:eA9wi6ZzpIRobvXkn/S2Lyw1hr2pc71zxzOPl7Xjs4w=
github.com/jeremywohl/flatten/v2 v2.0.0-20211013061545-07e4a09fb8e4/go.mod h1:s9g9Dfls+aEgucKXKW+i8MRZuLXT2MrD/WjYpMnWfOw=
github.com/jessevdk/go-flags v1.4.0/go.mod h1:4FA24M0QyGHXBuZZK/XkWh8h0e1EYbRYJSGM75WSRxI=
@@ -1531,7 +1529,6 @@ github.com/lightstep/lightstep-tracer-common/golang/gogo v0.0.0-20190605223551-b
github.com/lightstep/lightstep-tracer-go v0.18.1/go.mod h1:jlF1pusYV4pidLvZ+XD0UBX0ZE6WURAspgAczcDHrL4=
github.com/linkedin/goavro/v2 v2.12.0 h1:rIQQSj8jdAUlKQh6DttK8wCRv4t4QO09g1C4aBWXslg=
github.com/linkedin/goavro/v2 v2.12.0/go.mod h1:KXx+erlq+RPlGSPmLF7xGo6SAbh8sCQ53x064+ioxhk=
-github.com/linuxkit/virtsock v0.0.0-20201010232012-f8cee7dfc7a3/go.mod h1:3r6x7q95whyfWQpmGZTu3gk3v2YkMi05HEzl7Tf7YEo=
github.com/logrusorgru/aurora v0.0.0-20181002194514-a7b3b318ed4e/go.mod h1:7rIyQOR62GCctdiQpZ/zOJlFyk6y+94wXzv6RNZgaR4=
github.com/logzio/azure-monitor-metrics-receiver v1.0.0 h1:TAzhIZL2ueyyc81qIw8FGg4nUbts4Hvc3oOxSobY1IA=
github.com/logzio/azure-monitor-metrics-receiver v1.0.0/go.mod h1:UIaQ7UgxZ8jO3L0JB2hctsHFBbZqL6mbxYscQAeFpl4=
The patch didn't apply cleanly for me (I copied the patch, saved as a file, and patched), so I applied it by hand. I think I got it right but I have the same two failures.
But then I decided to just specify go 1.19 in go.mod so I could run go mod tidy locally.