
Switch to the InfluxDB 2 Go client library

Open na-- opened this issue 4 years ago • 26 comments

Prompted by this community forum question, I noticed that we're currently using the old InfluxDB library (notice influxdb1-client): https://github.com/loadimpact/k6/blob/6081e6a3ff488990aab3fa349a84521cd495a1b3/stats/influxdb/collector.go#L30

https://github.com/influxdata/influxdb-client-go appears to be the new version, from its readme:

This repository contains the reference Go client for InfluxDB 2. Note: Use this client library with InfluxDB 2.x and InfluxDB 1.8+ (see details). For connecting to InfluxDB 1.7 or earlier instances, use the influxdb1-go client library.

InfluxDB v1.8.0 seems to have been released on 2020-04-13, so it might be a bit too early to switch to the new Go library, considering that InfluxDB 2 should be somewhat backwards compatible, but in a few months it will probably be fine.

na-- avatar Nov 19 '20 08:11 na--

Comparing the new and old libraries, there seems to be very little reason to upgrade, unless someone has multiple orgs ...

We could also drop the dependency and just make the requests ourselves. We only do 2 requests, both pure HTTP, and doing them ourselves would let us optimize the marshaling, or anything else, for our own types. We also wouldn't have to wait on the dependency to add basic functionality ... which I then forgot to use :facepalm:

mstoykov avatar Nov 19 '20 08:11 mstoykov

From my tests with v2 a few months ago we weren't able to write to it, though I forget the exact reason... The line protocol format seems to have remained backwards compatible, but there are some conceptual differences with org support and more robust auth in v2 that make it incompatible with v1 clients. Plus the switch from InfluxQL to Flux which shouldn't matter for us, but it hints that they're fundamentally different.
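To make the conceptual difference concrete, here is a minimal sketch of the two write requests using only net/http. The v1 API addresses a database on /write, while the v2 API addresses an org and bucket on /api/v2/write and expects a token in the Authorization header. The function names, host, org, bucket, and token values are placeholders chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// NewV1WriteRequest builds a write request for the InfluxDB 1.x HTTP API,
// which identifies the destination by database name.
func NewV1WriteRequest(base, db, body string) (*http.Request, error) {
	q := url.Values{"db": {db}}
	return http.NewRequest(http.MethodPost, base+"/write?"+q.Encode(), strings.NewReader(body))
}

// NewV2WriteRequest builds a write request for the InfluxDB 2.x HTTP API,
// which identifies the destination by org and bucket and uses token auth.
func NewV2WriteRequest(base, org, bucket, token, body string) (*http.Request, error) {
	q := url.Values{"org": {org}, "bucket": {bucket}}
	req, err := http.NewRequest(http.MethodPost, base+"/api/v2/write?"+q.Encode(), strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Token "+token)
	return req, nil
}

func main() {
	// The body is the same line protocol payload in both cases.
	body := "vus value=10"

	v1, err := NewV1WriteRequest("http://localhost:8086", "k6", body)
	if err != nil {
		panic(err)
	}
	v2, err := NewV2WriteRequest("http://localhost:8086", "my-org", "k6", "placeholder-token", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(v1.URL.String())
	fmt.Println(v2.URL.String())
}
```

The requests are only constructed here, not sent, so the sketch runs without a live InfluxDB instance.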

imiric avatar Nov 19 '20 09:11 imiric

I'd love to see InfluxDB v2 support as well. It's a big upgrade, and has the data explorer & dashboard built-in so no need for Grafana. It would make consuming k6.io data much easier.

benc-uk avatar Dec 07 '20 17:12 benc-uk

I saw that authentication is different in v2. It would be awesome to have InfluxDB v2 support and take advantage of Flux features.

R0m3rCh avatar Dec 07 '20 21:12 R0m3rCh

We are already using influxdb v2, and building a new v1 environment for k6 reporting is a pain in the ass for us, while the v2 integrated panel is sufficient for us.

li-zhixin avatar Mar 02 '21 03:03 li-zhixin

I think this is something we should be able to do for k6 v0.32.0. We refactored the output interface recently (https://github.com/loadimpact/k6/pull/1869), to fix a bunch of issues and to make writing new outputs easier and output extensions possible. So far, only the json output has been moved to the new interface, though I also opened a PR (https://github.com/loadimpact/k6/pull/1874) for moving the cloud output yesterday that may or may not go into k6 v0.31.0 next week. All of the other outputs have wrappers that adapt them to the new interface without changing their code yet, but we plan to move them and completely retire the old interface in k6 v0.32.0.

In any case, support for output extensions with https://github.com/k6io/xk6 will be released for sure in k6 v0.31.0 next week. This should allow anyone to make an extension with influxdb2 support almost immediately. And I am not promising anything, but when we're moving the influxdb output from the old Collector interface to the new Output interface, we could probably move it to the new version as well, if it's easy...

na-- avatar Mar 02 '21 08:03 na--

This is fantastic, will there be a guide to using xk6 for creating an output extension, even if it's a simple example? I'm wondering if it will be possible to create an extension that outputs the metrics with an API rather than into a file e.g. for this https://github.com/loadimpact/k6/issues/1875

benc-uk avatar Mar 02 '21 08:03 benc-uk

will there be a guide to using xk6 for creating an output extension, even if it's a simple example?

There probably will be a page in the docs or maybe a blog post after we release k6 v0.31.0 (hopefully next week). For now, even this guide for JS extensions should suffice with a bit of work: https://k6.io/blog/extending-k6-with-xk6 For output extensions, instead of calling k6/js/modules.Register() to register your JS extension, you call this function to register your Output extension: https://github.com/loadimpact/k6/blob/82cab8b693963659d5f0cfb4cbf785c792577a0f/output/extensions.go#L45-L47

Where Output is the interface your extension is required to implement, and Params is a fat struct you can use for initialization, both of which are defined here: https://github.com/loadimpact/k6/blob/master/output/types.go

This is a complete example of an output that implements the new interface, though as I linked above, the cloud one is also WIP: https://github.com/loadimpact/k6/pull/1874.
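The shape of such an extension can be sketched roughly as follows. To keep the example self-contained, Sample, Params, Output, and RegisterExtension below are simplified local stand-ins, not the real k6 definitions; the actual ones live in the output package linked above and may differ in detail. A real extension would call the k6 registration function from an init() function rather than from main.

```go
package main

import (
	"errors"
	"fmt"
)

// Sample is a simplified stand-in for a k6 metric sample.
type Sample struct {
	Metric string
	Value  float64
}

// Params is a stand-in for the initialization struct passed to an output.
type Params struct {
	ConfigArgument string // e.g. the part after "=" in --out name=argument
}

// Output is a stand-in for the interface an output extension implements.
type Output interface {
	Description() string
	Start() error
	AddMetricSamples(samples []Sample)
	Stop() error
}

// registry and RegisterExtension mimic registering a constructor by name.
var registry = map[string]func(Params) (Output, error){}

func RegisterExtension(name string, ctor func(Params) (Output, error)) error {
	if _, exists := registry[name]; exists {
		return errors.New("output extension already registered: " + name)
	}
	registry[name] = ctor
	return nil
}

// countingOutput is a toy output that only counts the samples it receives.
type countingOutput struct {
	target string
	seen   int
}

func (o *countingOutput) Description() string { return "counting output (" + o.target + ")" }
func (o *countingOutput) Start() error        { return nil }
func (o *countingOutput) Stop() error         { return nil }

// AddMetricSamples should return quickly; slow work belongs in the background.
func (o *countingOutput) AddMetricSamples(samples []Sample) { o.seen += len(samples) }

func newCountingOutput(p Params) (Output, error) {
	return &countingOutput{target: p.ConfigArgument}, nil
}

func main() {
	if err := RegisterExtension("counting", newCountingOutput); err != nil {
		panic(err)
	}
	out, err := registry["counting"](Params{ConfigArgument: "http://localhost:8086"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Description())
}
```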

na-- avatar Mar 02 '21 12:03 na--

As a workaround for this I can recommend using telegraf with a config like:

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 5000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 100000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "1s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  debug = true
  ## Log only error level messages.
  # quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  logtarget = "stderr"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  # logfile = ""

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false


###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

# Configuration for sending metrics to InfluxDB
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["http://127.0.0.1:8086"]

  ## Token for authentication.
  token = "WtcQX31NksChPXruKd0uHV-yZoS8LA9UqoykB2PbXa7hw4cdXELwiylFCOe1pHMCPYLkOFtAFAAAhX2Fv7QOxg=="

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "m.org"

  ## Destination bucket to write into.
  bucket = "m.bucket"

  ## The value of this tag will be used to determine the bucket.  If this
  ## tag is not set the 'bucket' option is used as the default.
  # bucket_tag = ""

  ## If true, the bucket tag will not be added to the metric.
  # exclude_bucket_tag = false

  ## Timeout for HTTP messages.
  # timeout = "5s"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Proxy override, if unset values the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  content_encoding = "gzip"

  ## Enable or disable uint support for writing uints to InfluxDB 2.0.
  # influx_uint_support = false

  ## Optional TLS Config for use on HTTP connections.
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false


# Accept metrics over InfluxDB 1.x HTTP API
[[inputs.influxdb_listener]]
  ## Address and port to host InfluxDB listener on
  service_address = ":8186"

  ## maximum duration before timing out read of the request
  read_timeout = "10s"
  ## maximum duration before timing out write of the response
  write_timeout = "10s"

  ## Maximum allowed HTTP request body size in bytes.
  ## 0 means to use the default of 32MiB.
  max_body_size = "32MiB"

  # Uncomment the next line to pass only these 2 metrics
  # namepass = ["http_req_duration", "vus"]

Obviously, URL/ports, the token, organization, and the bucket need to be configured.

As an added bonus, the namepass option (the last line) can be used to drop metrics that you are not interested in, kind of like what #1321 will do.

mstoykov avatar Mar 11 '21 16:03 mstoykov

I tried to develop an extension based on the above information to write the output of k6 into the v2 version of influxDB, which should solve the problem

xk6-influxdbv2

If it helps you, please give it a star, thanks!

li-zhixin avatar Mar 12 '21 09:03 li-zhixin

In addition, I am a newbie at Go, so any suggestions and improvements are welcome.

li-zhixin avatar Mar 12 '21 09:03 li-zhixin

Awesome :tada: I was about to point out that you shouldn't make network requests in AddMetricSamples(), but then I saw that the new InfluxDB SDK's WriteAPI works asynchronously and batches writes! This is a great improvement compared to the previous version, where we had to essentially re-implement these things in the output code! :tada: :confetti_ball: :fireworks:

na-- avatar Mar 12 '21 10:03 na--

Thank you very much for the suggestion, @na--. I'm still a bit confused: you don't recommend sending network requests in the AddMetricSamples() method. Does this mean that the data needs to be written in the Stop() method, or should I just add the data to a collection in this method and then poll the collection in a background thread to write in batches?

li-zhixin avatar Mar 12 '21 10:03 li-zhixin

Ah, sorry, I didn't mean that you had to change your code. From what I can see, it would work alright, because the InfluxDB v2 SDK batches the metrics and sends them in the background, asynchronously. So your code should be fine as it is.

The v1 SDK didn't have such capabilities, so we had to implement them ourselves, on the k6 side. We have to do them for other output types as well, for example see the JSON output. That's why I created the SampleBuffer and PeriodicFlusher helpers, to make such outputs easier to implement, but you don't need them in your extension because the InfluxDB v2 SDK already does something very similar.

na-- avatar Mar 12 '21 10:03 na--

Thanks again for your reply, I understand what you meant now. Go's chan is very cool and I can already feel its power by simply reading your code. The mechanism of message communication to avoid race conditions is a bit similar to Akka's Actor pattern. Have a great time!

li-zhixin avatar Mar 12 '21 10:03 li-zhixin

Has there been any timeline decision on when influxdb 2 support will be added?

thecodejunkie avatar May 13 '21 14:05 thecodejunkie

@thecodejunkie, sorry, we didn't manage to get this in k6 v0.32.0. We thought it wasn't very urgent, since it seemed @li-zhixin had a workable extension, but that seems to have disappeared :confused: We'll try to fix it later in the v0.33.0 cycle, but if anyone wants to tackle it sooner, please say so and feel free to make a k6 PR (influxdb code is here) or another output extension!

na-- avatar May 14 '21 06:05 na--

For those who'd like to use the Telegraf config from @MStoykov in https://github.com/k6io/k6/issues/1730#issuecomment-796869067, I've got it working with this setup:

docker-compose.yml

version: '3.8'

services: 
    influxdb:
        image: influxdb:2.0.6-alpine
        environment:
            DOCKER_INFLUXDB_INIT_MODE: setup
            DOCKER_INFLUXDB_INIT_USERNAME: username
            DOCKER_INFLUXDB_INIT_PASSWORD: password
            DOCKER_INFLUXDB_INIT_ORG: org
            DOCKER_INFLUXDB_INIT_BUCKET: data
            DOCKER_INFLUXDB_INIT_ADMIN_TOKEN: very-secret-token
        volumes:
            - influxdb-config:/etc/influxdb2
            - influxdb-data:/var/lib/influxdb2
        networks: 
            - data

    telegraf:
        image: telegraf:1.18.2-alpine
        volumes:
            - ./docker/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf
        networks:
            - data

volumes: 
    influxdb-config:
    influxdb-data:

networks: 
    data:
        driver: bridge

./docker/telegraf/telegraf.conf

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 5000
  metric_buffer_limit = 100000
  collection_jitter = "0s"
  flush_interval = "1s"
  flush_jitter = "0s"
  precision = ""
  debug = true
  logtarget = "stderr"
  hostname = ""
  omit_hostname = false

[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]
  token = "very-secret-token"
  organization = "org"
  bucket = "data"
  content_encoding = "gzip"

[[inputs.influxdb_listener]]
  service_address = ":8186"
  read_timeout = "10s"
  write_timeout = "10s"
  max_body_size = "32MiB"

ivodvb avatar May 17 '21 09:05 ivodvb

@thecodejunkie, sorry, we didn't manage to get this in k6 v0.32.0. We thought it wasn't very urgent, since it seemed @li-zhixin had a workable extension, but that seems to have disappeared 😕 We'll try to fix it later in the v0.33.0 cycle, but if anyone wants to tackle it sooner, please say so and feel free to make a k6 PR (influxdb code is here) or another output extension!

Sorry about this, the repository is now back

li-zhixin avatar May 17 '21 11:05 li-zhixin

@li-zhixin for some reason I can't get your extension to work. I just get this error:

ERRO[0001] invalid output type 'influxdbv2', available types are: cloud, csv, datadog, influxdb, json, kafka, statsd

This happens when I try to run a workflow with the --out influxdbv2 switch. :(

standoubtayx avatar Jul 09 '21 08:07 standoubtayx

Using the extension provided by @li-zhixin fixed my issues with k6 disconnecting from telegraf influx listener when the number of vus increases. Really looking forward to official support.

BrynCooke avatar Aug 28 '21 18:08 BrynCooke

We have a PR that adds support for InfluxDB v2 natively in k6, if someone wants to try it: https://github.com/grafana/k6/pull/2110

Unfortunately, we didn't have enough time to test and benchmark it properly for it to go in k6 v0.34.0, so we won't merge it right now... :disappointed: We'll instead make it into an "official" xk6 extension in the coming days, so it's more easily usable right now and so we get some feedback from the community, and merge it to master in a week or two, so that it's released in k6 v0.35.0 in a couple of months.

na-- avatar Aug 30 '21 08:08 na--

Using the extension provided by @li-zhixin fixed my issues with k6 disconnecting from telegraf influx listener when the number of vus increases.

Hi @BrynCooke, can you add more details about the issues you experienced? We would like to identify them and check whether they are still present in the upcoming release and in the new InfluxDB v2 PR.

codebien avatar Aug 30 '21 08:08 codebien

The setup was a GKE cluster with separate nodes for k6 and the server under test. Telegraf is configured to send to grafana-cloud and also provide nodestats.

The test was a stress test where the number of vus would increase over time.

When running the test it would get to a certain number of vus (300 in my case) and then print something like: use of closed network connection" output=InfluxDBv. After this happens no more metrics are sent to telegraf.

The cluster nodes didn't look under stress at all.

BrynCooke avatar Aug 30 '21 10:08 BrynCooke

Hello, I'm here to announce that we’ve released a first version of an official InfluxDBv2 output extension: https://github.com/grafana/xk6-output-influxdb.

As you can see from some previous activities in this issue, we explored the original plan to include this feature directly in the k6 OSS core, however, it turns out that there were some critical issues from k6’s perspective:

  • v2 introduces some new concepts, like Organizations and a mandatory token-based security layer. This would have a significant impact on the k6 user experience, which relies on the ease of use of InfluxDB v1 (e.g. the auto-creation feature supported in v1 is no longer supported).

  • The InfluxDB-client dependency that we use to implement this feature would increase the binary size of the k6 CLI by 3MB.

  • v1 has some known performance problems and we expected noticeable improvements from v2, but we didn’t see them. We need to investigate this further to reach an accurate conclusion.

Considering these points, we decided that including v2 directly in the k6 OSS core would require a more thorough assessment and potentially further changes. Instead, we thought it would fit better as a k6 extension, for now, giving us the freedom to experiment more and start from a clean slate.

Currently, we don’t have a concrete plan for integrating the InfluxDBv2 extension in the k6 OSS core, but we are not excluding it either. We will analyze the impact and the feedback that we get from the community, and we will prioritize future development accordingly. For this reason, we are keeping this issue open.

Feel free to try the extension and open feature requests or bug issues directly in the extension’s repository. Please, don’t use this issue for reporting problems with the extension.

codebien avatar Nov 09 '21 15:11 codebien

+1 to integrate it into CORE K6:

  • v1.8 is no longer actively supported - the world has moved on to v2
  • v2 has significant downsampling improvements - that alone is enough of a motivation (in v1 it changes all names and there are a LOT of hoops to jump through)
  • k6 is a great tool to stress test (and successfully kill influx) even when you're not planning to - simply due to the amount of data k6 pushes, which is staggering

kkriegkxs avatar Apr 28 '22 17:04 kkriegkxs

+1 to integrate it into CORE K6:

  • v1.8 is no longer actively supported - the world has moved on to v2
  • v2 has significant downsampling improvements - that alone is enough of a motivation (in v1 it changes all names and there are a LOT of hoops to jump through)
  • k6 is a great tool to stress test (and successfully kill influx) even when you're not planning to - simply due to the amount of data k6 pushes, which is staggering

InfluxDB3.0 is out and much faster and k6 is still hanging on the 1.8... It's about time to support this, or at least support for Prometheus would also be very helpful.

AchimGrolimund avatar Aug 14 '23 19:08 AchimGrolimund

Hello,

We're not planning to switch to the InfluxDB v2 client because it does not support v1 (as expected), and doing so would be a breaking change for all existing users. Also, we don't have the capacity to maintain that support as part of the k6 core (and keep up with all new releases), and we prefer to prioritize standards like OpenTelemetry instead.

For those users looking for support for v2, there's an available extension: xk6-output-influxdb.

InfluxDB3.0 is out and much faster and k6 is still hanging on the 1.8...

For those users looking for support for v3, contributions (in the form of extensions as well) are welcome! 😄


So, with all that said, I'm closing this issue, as I think there's no remaining action or pending work.

Thanks!

joanlopez avatar Mar 11 '24 13:03 joanlopez