Allow disabling report usage when using the Operator

Open tpaschalis opened this issue 1 year ago • 2 comments

Currently, we can't use the Agent's command-line flags when deploying it via the Operator. This means that there's no easy way to disable the usage report to stats.grafana.org (normally done via the -disable-reporting flag).

We could provide a way for users to do that; I think a new flag on the agent-operator binary could be a decent option but I'm not sure if we have any other alternatives in mind.

tpaschalis avatar Jul 06 '22 15:07 tpaschalis

related: https://github.com/grafana/agent/issues/1798

marctc avatar Jul 14 '22 12:07 marctc

We could provide a way for users to do that; I think a new flag on the agent-operator binary could be a decent option but I'm not sure if we have any other alternatives in mind.

If we don't add a specific option in GrafanaAgent for disabling report usage, then I think a flag like this makes sense; something to pass generic extra flags to generated agent pods for some forward compatibility with new flags we might add.

rfratto avatar Jul 20 '22 14:07 rfratto

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed in 7 days if there is no new activity. Thank you for your contributions!

github-actions[bot] avatar Aug 20 '22 00:08 github-actions[bot]

Grafana Agent is bombarding my local DNS server with thousands of requests every single day. This is the result of one Grafana Agent running in the network. The block percentage was at 4.5% before updating the agent. [Screenshot, 2022-09-02 23:58:20]

Seriously, if you want to send telemetry, at least make sure that it fails silently instead of retrying dozens of times. [Screenshot, 2022-09-03 00:02:41]

Sorry to say this, but I don't find this acceptable.

Kovah avatar Sep 02 '22 22:09 Kovah

@Kovah Please be kinder to the maintainers. We're all humans and mistakes will be made. Obviously hammering your DNS server wasn't a deliberate design goal. That being said, I'm sorry you've run into this and it's something we'll investigate and fix.

This is the type of thing that should be its own issue, and I'll reference your comment in a new one for you. This is probably a bug somewhere in the code, since we currently configure a limit of 5 retries with exponential backoff and only send usage stats once every four hours. We'll have to investigate whether there are any default retries for DNS lookups.

rfratto avatar Sep 05 '22 02:09 rfratto

After upgrading the Grafana Agent to the latest version that was released yesterday, I still see it doing hundreds of requests to stats.grafana.org every hour. Most of the time 5 requests are quickly sent within one second.

Is there anything wrong with my configuration? I tried the official Docker image with the following commands, but neither works:

docker run grafana/agent:latest -disable-reporting --config.file=/etc/agent/agent.yaml --metrics.wal-directory=/etc/agent/data
docker run grafana/agent:latest --disable-reporting --config.file=/etc/agent/agent.yaml --metrics.wal-directory=/etc/agent/data

The help for the agent says that it should be possible with this command line flag:

Usage of /bin/agent:
  -config.file string
    	configuration file to load
  ...
  -disable-reporting
    	Disable reporting of enabled feature flags to Grafana.

Reading through the latest pull request made by @marctc, it doesn't really seem to apply; at least, the Agent complains that there is no disableReporting setting no matter where I put it in the configuration file. 😕

Kovah avatar Sep 13 '22 22:09 Kovah

Passing -disable-reporting or --disable-reporting as an argument should do it. I'm working on reproducing now and I'll get back to you soon.

In the meantime, are there any log messages from the agent that say "failed to send usage report"? That might help us track down why you're seeing so many DNS requests when the usage tracker is running.

Additionally, are you seeing a huge number of log lines saying "reporting agent stats"? That shouldn't appear more often than once a minute, and should appear as rarely as once every 4 hours when reports are sent successfully.

rfratto avatar Sep 13 '22 22:09 rfratto

Something weird is going on here.

I ran this:

$ docker run grafana/agent:latest -disable-reporting --config.file=/etc/agent/agent.yaml --metrics.wal-directory=/etc/agent/data

And I'm not seeing any log lines saying "running usage stats reporter". I do see that if I don't pass the -disable-reporting flag. Something else might be trying to make a request to stats.grafana.org. I'm looking into that now.

rfratto avatar Sep 13 '22 22:09 rfratto

@Kovah here's what I tried:

I ran the latest Agent Docker image with usage stats disabled, exposing its HTTP server to my host machine:

$ docker run -p 12345:12345 grafana/agent:latest -disable-reporting --config.file=/etc/agent/agent.yaml --metrics.wal-directory=/etc/agent/data -server.http.address=0.0.0.0:12345

Then, on my host machine, I navigated to http://localhost:12345/debug/pprof/goroutine?debug=1 to see which goroutines were running to ensure that usagestats were not active.

I didn't see anything related to usagestats, and no other goroutines which would imply sending data to stats.grafana.org.

Are all your running agents setting the -disable-reporting flag now? Is anything else overriding the command line flags which would prevent them from being set?

Reading through the latest pull request made by @marctc, it doesn't really seem to apply; at least, the Agent complains that there is no disableReporting setting no matter where I put it in the configuration file. 😕

For context, Marc's PR applied specifically to Grafana Agent Operator, which you don't appear to be running. When the usage stats were first created in Grafana Agent, we forgot to expose the flag as a setting in one of the operator's CRDs, which his PR resolved.

rfratto avatar Sep 13 '22 22:09 rfratto

Many thanks for the fast response!

I searched through the last few days of agent logs, but there is not a single line about reporting: nothing about whether it is enabled or disabled, and nothing about a failure to send reports.

I double-checked my Grafana setup (the main application), but the admin settings say that reporting is disabled: [Screenshot, 2022-09-14 00:33:15]

It really is weird. I'll try to recreate both Docker containers, maybe something went wrong when updating the apps.

Is anything else overriding the command line flags which would prevent them from being set?

Is there a config that could lead to this? The agent has only the posted cli flags and the config file which contains the scrape settings.

Kovah avatar Sep 13 '22 22:09 Kovah

Is there a config that could lead to this? The agent has only the posted cli flags and the config file which contains the scrape settings.

No, not at the agent level. My thinking was maybe the way you're running Docker containers might be overriding the arguments to the container, but it sounds like that's probably not the case.

Are you running any other Grafana projects? Loki and Mimir have their own usage stats too, I believe.

rfratto avatar Sep 13 '22 22:09 rfratto

Also run "docker run grafana/agent:latest --version" to double-check the version.

mattdurham avatar Sep 13 '22 22:09 mattdurham

Are you running any other Grafana projects? Loki and Mimir have their own usage stats too, I believe.

Yeah... case solved. It was Loki. Loki the god of trickery. 😆 Honestly, I did not expect Loki to send something to stats.grafana.org, but now it makes a lot of sense. I have set the corresponding config flag and the requests stopped.

Thank you so much for the help! Really appreciate it. 🙏

Kovah avatar Sep 13 '22 22:09 Kovah

Glad to hear you got it all sorted out! If you need any more help please feel free to reach out in a new issue :)

rfratto avatar Sep 13 '22 23:09 rfratto