Index out of range error in parseMemInfo for linux

Open kuznetsovpn opened this issue 2 years ago • 3 comments

Host operating system: output of uname -a

Linux version 5.4.0-153-generic (buildd@bos03-amd64-008) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023

node_exporter version: output of node_exporter --version

node_exporter, version 1.6.0

node_exporter command line flags

/bin/sh -c /usr/local/bin/node_exporter --web.listen-address=:42000 --collector.systemd --collector.processes

node_exporter log output

panic: runtime error: index out of range [1] with length 1

goroutine 218975 [running]:
github.com/prometheus/node_exporter/collector.parseMemInfo({0xbfb740, 0xc000095460})
  /app/collector/meminfo_linux.go:56 +0x2b9
github.com/prometheus/node_exporter/collector.(*meminfoCollector).getMemInfo(0x0)
  /app/collector/meminfo_linux.go:40 +0xf0
github.com/prometheus/node_exporter/collector.(*meminfoCollector).Update(0xc000215150, 0xc000319d40)
  /app/collector/meminfo.go:50 +0x3e
github.com/prometheus/node_exporter/collector.execute({0xb1390b, 0xc00041a1e0}, {0xbfa9a0, 0xc000215150}, 0xc00041a240, {0xbfa3c0, 0xc000031c40})
  /app/collector/collector.go:161 +0x9c
github.com/prometheus/node_exporter/collector.NodeCollector.Collect.func1({0xb1390b, 0xc00030e790}, {0xbfa9a0, 0xc000215150})
  /app/collector/collector.go:152 +0x3d
created by github.com/prometheus/node_exporter/collector.NodeCollector.Collect
  /app/collector/collector.go:151 +0xd5

Are you running node_exporter in Docker?

no

What did you do that produced an error?

My setup is several LXC containers running node exporters. One day, after the host machine was rebooted, the node exporters started to crash from time to time with these errors in their logs.

What did you expect to see?

I found a similar issue: https://github.com/prometheus/node_exporter/pull/1671. It says the panic was caused by empty lines in /proc/meminfo. Perhaps in my case, instead of an empty line, I get a malformed line without a key or a value, so the length of "parts" in parseMemInfo is 1, which causes the error because the code accesses parts[1].

I don't really understand why this happens on my system. I couldn't catch any malformed lines in /proc/meminfo, maybe because it happens randomly.

Perhaps it should be fixed with additional validation like in https://github.com/prometheus/node_exporter/pull/1671, but for a parts length equal to 1. Or perhaps you could just give me some advice on a better way to deal with my problem, because it may be a local problem with my OS rather than a problem with the exporter.

kuznetsovpn, Jul 18 '23 12:07

I think we found and fixed these parsing issues in the prometheus/procfs version of the /proc/meminfo parser. But we never refactored the collector here to use the new parser.

SuperQ, Jul 18 '23 13:07

Thank you for the reply!

Yes, I see the validation in procfs: https://github.com/prometheus/procfs/blob/master/meminfo.go#L167

Would this issue be helpful for planning the refactoring you mentioned, or should I just close it?

kuznetsovpn, Jul 19 '23 11:07

No, let's keep this issue open to track the fix implementation here.

SuperQ, Jul 19 '23 12:07