grafana-unraid-stack

influxdb and telegraf crashing after 1/27/23 update

Open m-bongio opened this issue 3 years ago • 19 comments

Good morning,

Overall, huge shout out and thank you for creating this... I love the visual view into how the server is doing, and how easy this container made it to set up. After updating this morning, I noticed that it isn't displaying any data (it was fine moments before I updated it), and then noticed that influxdb and telegraf appear to be crashing. Any suggestions on how to fix this? Below is the log.

[info] Initialisation started...
[info] influxdb fixed.
[info] loki fixed.
[info] telegraf fixed.
[info] promtail fixed.
[info] grafana fixed.
[info] Initialisation complete

[info] Runing apps...
[info] Run influxdb as service on port 8086
Executable /usr/bin/influxd does not exist!
[info] Run loki as daemon on port 3100
[info] Run telegraf as service
[info] Run promtail as daemon on port 9086
[info] Run grafana as service on port 3006
  • Starting Grafana Server ...done.
[info] All done

[error] influxdb crashed!
[info] loki PID: 60
[info] Skip hddtemp due to USE_HDDTEMP set to no
[error] telegraf crashed!
[info] promtail PID: 75
[info] grafana PID: 91

[info] Initialisation started...
[info] influxdb fixed.
[info] loki fixed.
[info] telegraf fixed.
[info] promtail fixed.
[info] grafana fixed.
[info] Initialisation complete

[info] Runing apps...
[info] Run influxdb as service on port 8086
Executable /usr/bin/influxd does not exist!
[info] Run loki as daemon on port 3100
[info] Run telegraf as service
[info] Run promtail as daemon on port 9086
[info] Run grafana as service on port 3006
  • Starting Grafana Server ...done.
[info] All done

[error] influxdb crashed!
[info] loki PID: 60
[info] Skip hddtemp due to USE_HDDTEMP set to no
[error] telegraf crashed!
[info] promtail PID: 81
[info] grafana PID: 104
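
For anyone comparing their own log against this one, here is a rough sketch (not part of the container) that pulls the crashed services and missing executables out of a log like the one above. The function name `summarize_startup_log` and the two regular expressions are my own assumptions, based only on the message formats visible in this log.

```python
import re

# Assumed format, taken from the log above: "[error] influxdb crashed!"
CRASH_RE = re.compile(r"\[error\]\s+(\S+)\s+crashed!")
# Assumed format, taken from the log above: "Executable /usr/bin/influxd does not exist!"
MISSING_RE = re.compile(r"Executable\s+(\S+)\s+does not exist!")


def summarize_startup_log(log_text):
    """Return (crashed_services, missing_executables), de-duplicated, in first-seen order."""
    crashed = list(dict.fromkeys(CRASH_RE.findall(log_text)))
    missing = list(dict.fromkeys(MISSING_RE.findall(log_text)))
    return crashed, missing


if __name__ == "__main__":
    sample = (
        "[info] Run influxdb as service on port 8086\n"
        "Executable /usr/bin/influxd does not exist!\n"
        "[error] influxdb crashed!\n"
        "[info] loki PID: 60\n"
        "[error] telegraf crashed!\n"
    )
    crashed, missing = summarize_startup_log(sample)
    print("crashed:", crashed)
    print("missing executables:", missing)
```

In this log the missing `/usr/bin/influxd` binary shows up before the crash lines, which suggests the influxdb package was never installed in the updated image rather than failing at runtime.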

m-bongio avatar Jan 27 '23 14:01 m-bongio

Getting the same here

rhcp011235 avatar Jan 27 '23 20:01 rhcp011235

Seems to be a general problem. Will there be a fix? Thanks and best regards!

skyn3t1337 avatar Jan 28 '23 05:01 skyn3t1337

Same problem here. Hope there will be a fix soon.

ZoXx avatar Jan 28 '23 08:01 ZoXx

Looks like there was another update this morning, but influxdb and telegraf still crashed. I just rolled back to testdasi/grafana-unraid-stack:s230122, which is working fine for me.

m-bongio avatar Jan 28 '23 15:01 m-bongio

I report the same problem

juan11perez avatar Jan 29 '23 13:01 juan11perez

Same here...

I suppose that it is time to build the stack myself out of all the components...

P6g9YHK6 avatar Jan 29 '23 23:01 P6g9YHK6

> Same here...
>
> I suppose that it is time to build the stack myself out of all the components...

Don't think that will help you really, because then you need to keep track of all the version compatibility yourself. It's a bummer you can't really see a Changelog in Unraid for any updates (can you?)..

SebaGnich avatar Feb 01 '23 13:02 SebaGnich

another update today and it still crashes

rhcp011235 avatar Feb 14 '23 17:02 rhcp011235

Same issue. Reverted to s230122 as previously mentioned and I am good to go now.

bubba925 avatar Feb 28 '23 15:02 bubba925

Tried latest from March 14 and still crashed. Reverting to s230122 did the trick.

yorch avatar Mar 22 '23 22:03 yorch

I put up a PR with the fix; the cert for InfluxData changed. It just needs a merge.

bobbo489 avatar Apr 03 '23 04:04 bobbo489

@bobbo489 can you please put the link to the PR?

yorch avatar Apr 03 '23 11:04 yorch

I updated it in the static-ubuntu package, since in this repo everything is in the deprecated folder.

https://github.com/testdasi/static-ubuntu/pull/1

bobbo489 avatar Apr 05 '23 19:04 bobbo489

same error with a clean install here.

[info] Run influxdb as service on port 8086 Executable /usr/bin/influxd does not exist!

fapo85 avatar Apr 10 '23 16:04 fapo85

Reverting to s230122 worked!

aslcmowmaejfo avatar Apr 25 '23 14:04 aslcmowmaejfo

where merge? :\

Flummi avatar Apr 28 '23 18:04 Flummi

I reverted to s230122, but after some time (and also after a clean install) everything broke again.

SebaGnich avatar May 23 '23 18:05 SebaGnich

Adding to this. Not entirely sure when it broke. It's been a week or so since I last looked at my dashboard, but last night I noticed it wouldn't connect, and when I looked at the logs I saw the same error:

Tried multiple versions, all broken.

  • latest
  • s230324
  • s230312
  • even s230122

Clean installing the container didn't help either. Feels like the data may be corrupted, but I really don't want to recreate 4 dashboards by doing a proper clean install, so I'm going to restore a backup from last week and try again.

Edit:
Restoring a previous backup did not help.

dlchamp avatar Jun 24 '23 11:06 dlchamp

Managed to get it up and running after deleting the entire appdata.

What I noticed is that there were a bunch of files for healthcheck-failure (more than 100,000 files).

I couldn't even run ls inside the Grafana-Unraid-Stack directory.

Maybe these healthcheck files need to be periodically cleaned or refactored so that this doesn't happen in the future?
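
As a stopgap until that happens, something like the following could be run on a schedule. It is only a sketch under assumptions: the `healthcheck-failure` name prefix is a guess based on what I saw in the folder, the appdata path in the usage line is hypothetical, and `remove_stale_files` is my own helper, not part of the container. It uses `os.scandir`, which streams directory entries instead of building and sorting the full listing the way `ls` does.

```python
import os
import time


def remove_stale_files(directory, prefix, max_age_seconds, dry_run=True):
    """Find regular files in `directory` whose names start with `prefix` and whose
    modification time is older than `max_age_seconds`. Delete them unless `dry_run`.
    Returns the list of matching paths."""
    cutoff = time.time() - max_age_seconds
    matched = []
    # scandir yields entries lazily, so it copes with very large directories.
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.name.startswith(prefix):
                continue
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime >= cutoff:
                continue
            matched.append(entry.path)
    # Delete only after the scan has finished, so the directory is not
    # modified while it is being iterated.
    if not dry_run:
        for path in matched:
            os.remove(path)
    return matched


if __name__ == "__main__":
    # Hypothetical path and prefix; adjust to your own appdata layout.
    appdata = "/mnt/user/appdata/Grafana-Unraid-Stack"
    if os.path.isdir(appdata):
        stale = remove_stale_files(appdata, "healthcheck-failure", 24 * 3600, dry_run=True)
        print(f"{len(stale)} stale files would be removed")
```

The default `dry_run=True` only reports what would be deleted; pass `dry_run=False` once the matched list looks right. This treats the symptom only, since the files will keep appearing until whatever fails the healthcheck is fixed.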

Edit: It seems to be creating them again every minute

probably related to PR#1 which needs to be merged

stefan-matic avatar Apr 30 '24 13:04 stefan-matic