Closing Web Browser Causes DatasourceNoData Alert And Graph Data To Be Cleared (MQTT Datasource)

Open Drakynn opened this issue 2 years ago • 6 comments

What happened:

During the process of setting up my new Grafana dashboard, any edit involving the mqtt-datasource topic reset all of the collected data. When I was first just editing the dashboard panels, I assumed this was working "as intended" during the edit/save cycle. When I started work on the alarms, I thought it was odd that this also reset the data.

I took my system live yesterday and noticed last night that if I close the browser window that I was simply watching the dashboard from, my data would reset and a DatasourceNoData alarm condition was raised. There was no editing or saving of panels. I repeated this several times from both of my computers with different browsers. In each case, I had the same result: data wipe and alarm generated.

After about a minute, the DatasourceNoData condition is resolved and normal data collection resumes.

What you expected to happen:

I expect to be able to close the browser without resetting all of my collected data and without receiving a no-data alarm via email. I should be able to observe the visualization without destroying the data or generating false alarms.

How to reproduce it (as minimally and precisely as possible):

I initially configured this in a VM, then moved it to a Raspberry Pi4. I followed the same steps both times with identical results.

  1. Install latest Grafana via apt
  2. Install Mosquitto MQTT broker via apt
  3. Build grafana/mqtt-datasource plugin from source
  4. Start live data collection from IoT device
  5. Create simple time series panel to start graphing data
  6. Set up an alarm threshold with No Data alarm
  7. Allow some data to collect
  8. Close web browser
  9. Wait a minute for no data alarm
  10. Re-open browser and see the previously collected data is gone

Anything else we need to know?:

Other than spurious data loss and associated alarms, all systems are working as expected.

grafana.log shows the following relevant entries, most of which are just the alarm cycle. The key entry seems to be the "stop streaming" at the moment the browser closes.

logger=context t=2022-03-17T08:38:08-0400 lvl=info msg="Request Completed" method=GET path=/ status=302 remote_addr=192.168.2.10 time_ms=0 size=29 referer=
logger=http.server t=2022-03-17T08:38:24.64-0400 lvl=info msg="Successful Login" User=admin@localhost
logger=context t=2022-03-17T08:38:25.22-0400 lvl=info msg="Request Completed" method=GET path=/api/live/ws status=0 remote_addr=192.168.2.10 time_ms=2 size=0 referer=
logger=plugin.grafana-mqtt-datasource t=2022-03-17T08:39:35.39-0400 lvl=info msg="stop streaming (context canceled)"
logger=alertmanager org=1 level=debug component=dispatcher msg="Received alert" alert=DatasourceNoData[24f10d8][active]
logger=alertmanager org=1 level=debug component=dispatcher aggrGroup="{}/{scope=\"house\"}:{}" msg=flushing alerts=[DatasourceNoData[24f10d8][active]]
logger=alertmanager org=1 level=debug component=dispatcher receiver="House Alert" integration=email[0] msg="Notify success" attempts=1
logger=alertmanager org=1 level=debug component=dispatcher aggrGroup="{}/{scope=\"house\"}:{}" msg=flushing alerts=[DatasourceNoData[24f10d8][resolved]]
logger=alertmanager org=1 level=debug component=dispatcher receiver="House Alert" integration=email[0] msg="Notify success" attempts=1

Environment: As noted, this was originally set up clean a few days ago using amd64 binaries in a VM rather than the arm64 packages for the Raspberry Pi. I'm confident the hardware platform is not the issue.

  • Grafana version: grafana/stable,now 8.4.3 arm64 (Edit: Not fixed with 8.4.4)
  • Data source type & version: grafana/mqtt-datasource (latest)
  • OS Grafana is installed on: Debian 11.2
  • User OS & Browser: macOS 10.15.7 and Win 11, Firefox, Chrome and Safari
  • Grafana plugins: only mqtt-datasource
  • Others: Mosquitto MQTT - mosquitto/stable, now 2.0.11-1 arm64

Drakynn avatar Mar 17 '22 13:03 Drakynn

I have transferred this issue from the grafana repo as it seems related to this data source.

ivanahuckova avatar Mar 18 '22 11:03 ivanahuckova

In pkg/plugin/datasource.go, RunStream() defers a call to Client.Unsubscribe, which runs as soon as ctx.Done() fires and the function returns. That is the state when a browser disconnects from the Grafana console.

pkg/plugin/datasource.go

func (ds *MQTTDatasource) RunStream(ctx context.Context, req *backend.RunStreamRequest, sender *backend.StreamSender) error {
	ds.Client.Subscribe(req.Path)
	defer ds.Client.Unsubscribe(req.Path)    // <--  

	for {
		select {
		case <-ctx.Done():
			backend.Logger.Info("stop streaming (context canceled)")
			return nil
		case message := <-ds.Client.Stream():
			if message.Topic != req.Path {
				continue
			}
			err := ds.SendMessage(message, req, sender)
			if err != nil {
				log.DefaultLogger.Error(fmt.Sprintf("unable to send message: %s", err.Error()))
			}
		}
	}
}

pkg/mqtt/client.go

func (c *Client) Unsubscribe(t string) {
	log.DefaultLogger.Debug(fmt.Sprintf("Unsubscribing from MQTT topic: %s", t))
	c.client.Unsubscribe(t)
	c.topics.Delete(t)
}

The topic is being explicitly deleted on an unsub, and unsub is being explicitly called on a client disconnect.

I'm not sure what other effects it has, but I found that with the deferred call to Client.Unsubscribe commented out, I'm getting the behaviour I expect from Grafana.

I am able to stop viewing a panel without the data being wiped. This means it's OK to go from the dashboard to editing alarms or even close the browser window entirely without data loss.

Drakynn avatar Mar 18 '22 14:03 Drakynn

I believe this also resolves issues #36 and #37, which seem to be variants of "data is lost when I close the browser".

Of course, this comes with the caveat that there may be other error conditions, which need to be caught separately, where it is legitimate to unsub from the topic and delete it.

Drakynn avatar Mar 18 '22 14:03 Drakynn

I have used mqtt-datasource for a couple of weeks now without the deferred Client.Unsubscribe() call on client disconnect.

I'm enjoying being able to revisit dashboards without the data wipes. I've noticed only one minor artifact.

When returning to a dashboard that has not been displayed recently, the graph builds somewhat strangely. The MQTT data seems to be gobbled up from left to right with a series of rapid screen updates, leaving only the most recent value displayed on the far right. Refreshing the browser shows a current view of the most recent data which in my case is the last 6 hours. If the browser is left open the page updates normally.

Drakynn avatar Mar 31 '22 17:03 Drakynn

Here's a link to a 15-second video showing how the panels quickly seem to "eat" the old values. The video is recorded at 1x normal speed.

First, the gauge rapidly bounces through all the values, then the graph seems to eat the old values, left to right, leaving only the most recent data point on the graph.

The browser is manually refreshed at the 11-second mark, which brings up the last 6 hours of data as it should have displayed on the original load.

https://www.dropbox.com/s/x9r1f656jjwj19u/mqtt-display-bug.mp4?dl=0

Drakynn avatar Apr 10 '22 02:04 Drakynn

Is this fixed in the latest releases? I still have the same problem: close the browser, open it again, and all graphs are zero.

sanitariu avatar Oct 01 '23 19:10 sanitariu