nrpe_exporter
exporter panics at irregular intervals due to rand.NewSource not being safe for concurrent use
I am using the nrpe_exporter (https://github.com/RobustPerception/nrpe_exporter) to make about 4500 nrpe client requests every 70s. A few times a day at irregular intervals (typically hours, but sometimes minutes) the nrpe_exporter crashes with the following error:
panic: runtime error: index out of range
goroutine 3888620 [running]:
math/rand.(*rngSource).Int63(0xc420083500, 0x5d3ea3c976019300)
/home/dprittie/.cache/pants/bin/go/linux/x86_64/1.8.3/go/go/src/math/rand/rng.go:231 +0x8c
math/rand.(*Rand).Int63(0xc420018c30, 0x5d3ea3c976019300)
/home/dprittie/.cache/pants/bin/go/linux/x86_64/1.8.3/go/go/src/math/rand/rand.go:81 +0x33
math/rand.(*Rand).Uint32(0xc420018c30, 0xc420765ae4)
/home/dprittie/.cache/pants/bin/go/linux/x86_64/1.8.3/go/go/src/math/rand/rand.go:84 +0x2b
nrpe.randomizeBuffer(0xc4208e4480, 0x40c, 0x40c)
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe.nrpe/src/nrpe/nrpe.go:110 +0x52
nrpe.buildPacket(0xc400000001, 0xc420765bb0, 0x8, 0x20, 0x8)
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe.nrpe/src/nrpe/nrpe.go:202 +0x208
nrpe.Run(0xcce260, 0xc4202ac0f0, 0xc4202cce54, 0x8, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, ...)
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe.nrpe/src/nrpe/nrpe.go:282 +0x10f
main.collectCommandMetrics(0xc4202cce54, 0x8, 0x0, 0x0, 0x0, 0xcce140, 0xc420b300c8, 0xcc53a0, 0xc420123620, 0xc4208d7670, ...)
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe_exporter.nrpe_exporter/src/nrpe_exporter/nrpe_exporter.go:49 +0x120
main.(*Collector).Collect(0xc420596b90, 0xc4202cd080)
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe_exporter.nrpe_exporter/src/nrpe_exporter/nrpe_exporter.go:72 +0x357
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func2(0xc4201bc5e0, 0xc4202cd080, 0xcc89a0, 0xc420596b90)
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/3rdparty.go.github.com.prometheus.client_golang.prometheus/src/github.com/prometheus/client_golang/prometheus/registry.go:433 +0x61
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
/opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/3rdparty.go.github.com.prometheus.client_golang.prometheus/src/github.com/prometheus/client_golang/prometheus/registry.go:434 +0x2ec
Note the line numbers may be a little off, because we have applied a couple of patches to suit our local environment.
I believe we see this error because the Source returned by rand.NewSource is not safe for concurrent use by multiple goroutines, which I found documented at https://pkg.go.dev/math/rand. I have opened a PR to address this upstream at https://github.com/envimate/nrpe/pull/3. Ideally it gets fixed there first, then after the merge gets pulled into https://github.com/aperum/nrpe, and finally here.
I am opening this issue so that anyone else who runs into this problem while using the nrpe_exporter can benefit from what I have found so far, and also because both nrpe projects have had no activity for over four years. We may find that getting this fix into nrpe_exporter requires a new fork of aperum/nrpe.
Thanks for letting us know. Please send in a PR once upstream has the fix.
@brian-brazil - this is not going very well :(
The owner of https://github.com/envimate/nrpe no longer wishes to maintain the repo and has tried to hand it off to someone, but it has been 22 days with no response.
Also, I suspect that aperum may have noticed this conversation, because some time in the last 22 days https://github.com/aperum/nrpe was deleted.
Would anyone from RobustPerception be interested in taking this on? I would love to but I am an absolute beginner at Go and I suspect I would not do a very good job.
That's not good news.
If @aperum upstream has disappeared, then we either need a new upstream or to copy the code in here. We don't actively use this code ourselves, but can provide code review.
Actually, I wasn't aware that anyone was using this. I haven't used it in ages and so "garbage collected" it a few weeks ago. GitHub won't let me revive it, so I'm sorry for the inconvenience.
Also, I suspect that aperum may have noticed this conversation, because some time in the last 22 days https://github.com/aperum/nrpe was deleted.
Sorry for intruding on this repo's issue, but does anyone have a link to a fork of that nrpe repo? I use it in a small Go program, and using Go modules without a corresponding repo isn't terribly ideal :)