
scraper: add scraper interval option

Open zhulongcheng opened this issue 6 years ago • 13 comments

https://github.com/influxdata/influxdb/blob/a64c4fd138ef11336de84b4bd42bc0f49b7d8d1d/cmd/influxd/launcher/launcher.go#L587-L591

Currently, the scraper scheduler interval is hard-coded to 10*time.Second, and all scrapers share the same interval.

Should each scraper have its own interval option?

cc @russorat

zhulongcheng avatar Mar 28 '19 03:03 zhulongcheng

@zhulongcheng yes they should. thanks for reporting!

russorat avatar Mar 28 '19 17:03 russorat

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 22 '19 23:07 stale[bot]

don't close

zhulongcheng avatar Jul 24 '19 11:07 zhulongcheng

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 30 '19 22:10 stale[bot]

This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.

stale[bot] avatar Nov 06 '19 22:11 stale[bot]

This would be a very useful feature. +1

NF1198 avatar Oct 08 '20 05:10 NF1198

@NF1198 have you tried configuring a Flux Task to do the scraping using the https://docs.influxdata.com/influxdb/v2.0/reference/flux/stdlib/experimental/prometheus/scrape/ function?

russorat avatar Oct 08 '20 16:10 russorat

This would be helpful. Some of the data sources I wrote scrapers for take more than 10s to respond, and some only provide new data once a minute.

gg-g avatar May 10 '21 05:05 gg-g

Really needed +1

> @NF1198 have you tried configuring a Flux Task to do the scraping using the https://docs.influxdata.com/influxdb/v2.0/reference/flux/stdlib/experimental/prometheus/scrape/ function?

Tried this, but wasn't able to get it working...

Nevertheless, scrapers are just very easy to set up and use... only an interval option is missing

scrobbleme avatar Jul 19 '21 13:07 scrobbleme

Using a scraper would be useful if the interval could be set. My data takes longer than 10 seconds to GET, and if your data doesn't change that often, the extra requests are wasted data and workload that could be spent elsewhere.

d0rkster avatar Aug 21 '21 13:08 d0rkster

Would be a very useful feature! +1

alehmann76 avatar Apr 24 '22 11:04 alehmann76

Are there any updates? I'm looking for a way to set the scraping interval per scraper.

niyantaz avatar Aug 01 '22 06:08 niyantaz

Being able to set intervals would be super useful; it would allow us to put Prometheus API endpoints in front of everything, even old PDUs, terminals, etc. The 10-second interval can be a bit much for these dated types of equipment, and a longer interval also saves processing time, reducing hardware requirements for the InfluxDB server.

ebc-conscia avatar Aug 08 '22 11:08 ebc-conscia

+1 for the interval setting on scrapers. I've tried the feature and removed my scrapers again because a 20sec interval was too much for this specific case. Without a configurable interval I won't be using this feature at all. I've migrated to a task now and that works. For anyone (@scrobbleme) interested in the workaround for this issue:

  1. Create a notebook.

  2. Build a query like:

     import "experimental/prometheus"

     prometheus.scrape(url: "http://192.168.x.x:9092/metrics")
         |> to(bucket: "bucketname")

  3. Press + to add a task action.

  4. Fill in the cron interval.

  5. Save.
peterpeerdeman avatar Oct 28 '22 13:10 peterpeerdeman

@peterpeerdeman Yeah, I've already migrated to tasks too. They are working now ...

I'm monitoring many WordPress sites, so my task template looks like this:

import "experimental/prometheus"

option task = {name: "example.com", every: 15m}

baseUrl = "https://example.com"
prometheusKey = "A-RANDOM-SECRET-KEY"
hostingVendor = "MyHoster"
hostingRate = "Webhosting"

url =
    baseUrl + "/wp-json/metrics?all=yes"
        + "&prometheus=" + prometheusKey
        + "&label_hosting-vendor=" + hostingVendor
        + "&label_hosting-rate=" + hostingRate

prometheus.scrape(url: url)
    |> to(orgID: "myOrgID", bucket: "WP Performance Measurement")

scrobbleme avatar Oct 28 '22 13:10 scrobbleme