
textfile collector ignores file and debug logging

Open Endemoniada opened this issue 2 years ago • 8 comments

Host operating system: output of uname -a

Host: Linux archivist 5.10.83-1-lts #1 SMP Wed, 01 Dec 2021 12:35:24 +0000 x86_64 GNU/Linux Container: Linux 832bd12de6a8 5.10.83-1-lts #1 SMP Wed, 01 Dec 2021 12:35:24 +0000 x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

$ node_exporter --version
node_exporter, version 1.3.1 (branch: HEAD, revision: a2321e7b940ddcff26873612bccdf7cd4c42b6b6)
  build user:       root@243aafa5525c
  build date:       20211205-11:09:49
  go version:       go1.17.3
  platform:         linux/amd64

node_exporter command line flags

/bin/node_exporter --path.rootfs=/host --collector.textfile.directory="/var/lib/node_exporter/textfile_collector" --log.level=debug

Are you running node_exporter in Docker?

yes

What did you do that produced an error?

Specified the --collector.textfile.directory path and --log.level=debug in the startup options.

What did you expect to see?

Metric data from 'smart_metrics.prom' in the textfile.directory path appearing in Prometheus, as well as some kind of debug output regarding potential errors or problems.

What did you see instead?

No data, no warnings, no INFO or DEBUG output regarding this file or directory at all.

docker container log output:

ts=2022-01-31T07:35:35.287Z caller=node_exporter.go:182 level=info msg="Starting node_exporter" version="(version=1.3.1, branch=HEAD, revision=a2321e7b940ddcff26873612bccdf7cd4c42b6b6)"
ts=2022-01-31T07:35:35.287Z caller=node_exporter.go:183 level=info msg="Build context" build_context="(go=go1.17.3, user=root@243aafa5525c, date=20211205-11:09:49)"
ts=2022-01-31T07:35:35.288Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+)($|/)
ts=2022-01-31T07:35:35.288Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:108 level=info msg="Enabled collectors"
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=arp
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=bcache
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=bonding
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=btrfs
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=conntrack
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=cpu
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=cpufreq
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=diskstats
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=dmi
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=edac
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=entropy
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=fibrechannel
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=filefd
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=filesystem
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=hwmon
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=infiniband
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=ipvs
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=loadavg
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=mdadm
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=meminfo
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=netclass
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=netdev
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=netstat
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=nfs
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=nfsd
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=nvme
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=os
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=powersupplyclass
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=pressure
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=rapl
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=schedstat
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=sockstat
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=softnet
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=stat
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=tapestats
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=textfile
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=thermal_zone
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=time
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=timex
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=udp_queues
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=uname
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=vmstat
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=xfs
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:115 level=info collector=zfs
ts=2022-01-31T07:35:35.288Z caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
ts=2022-01-31T07:35:35.294Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false

Note that there is no mention of the --collector.textfile.directory or --log.level flags; it simply lists the textfile collector as enabled and nothing else.

This is the service block for the container in docker-compose:

  node_exporter:
    image: quay.io/prometheus/node-exporter:latest
    container_name: node_exporter
    command:
      - '--path.rootfs=/host --collector.textfile.directory="/var/lib/node_exporter/textfile_collector" --log.level=debug'
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
      - '/opt/node_exporter/textfile_collector:/var/lib/node_exporter/textfile_collector'

I've made sure the .prom file is present in the container, and is readable by the same user running the node_exporter process. The .prom file is generated by the "smartmon.sh" script that is seemingly recommended and used by many people, so I don't expect there to be errors in the file itself. If there are, I would expect to see some kind of error from node_exporter, especially since I specified debug-level log output.
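
For reference, this kind of check can be run from the Docker host with docker exec (container name as in the compose file above):

$ docker exec node_exporter ls -l /var/lib/node_exporter/textfile_collector
$ docker exec node_exporter head /var/lib/node_exporter/textfile_collector/smart_metrics.prom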

Endemoniada avatar Jan 31 '22 09:01 Endemoniada

Maybe a quoting issue? Can you try --collector.textfile.directory=/var/lib/node_exporter/textfile_collector instead? Maybe compose passes these as one literal string. Either way, I don't think this is a problem in node_exporter itself. We might want to consider logging the result of https://github.com/prometheus/node_exporter/blob/master/collector/textfile.go#L196 at debug log level. A PR for that would be welcome.
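
If it is the quoting, each flag needs to be its own list item under command: so Compose passes them as separate arguments instead of one literal string. Something like this, adapted from the snippet above (untested sketch):

    command:
      - '--path.rootfs=/host'
      - '--collector.textfile.directory=/var/lib/node_exporter/textfile_collector'
      - '--log.level=debug'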

discordianfish avatar Jan 31 '22 11:01 discordianfish

Yes, this definitely looks like a quoting issue to me. Docker is very likely passing the path in as the literal string "/var/lib/node_exporter/textfile_collector", quotes included.

@discordianfish We've been putting those flag parsing lines in as Info so they are printed once at startup, like filesystem_common.go:111.
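
For the textfile collector that could look roughly like the sketch below; the constructor and flag variable names are assumed from the current source layout, so treat it as a sketch rather than a drop-in patch:

// collector/textfile.go (sketch): log the configured directory once at
// startup, the same way filesystem_common.go logs its parsed flags.
// Assumes the existing go-kit logger and the textFileDirectory kingpin flag.
func NewTextFileCollector(logger log.Logger) (Collector, error) {
	level.Info(logger).Log("msg", "Parsed flag --collector.textfile.directory", "flag", *textFileDirectory)
	return &textFileCollector{
		path:   *textFileDirectory,
		logger: logger,
	}, nil
}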

SuperQ avatar Jan 31 '22 13:01 SuperQ

Maybe a quoting issue? Can you try --collector.textfile.directory=/var/lib/node_exporter/textfile_collector instead?

Tried it, no difference whatsoever. I'm not getting anything in the process output, nothing in Prometheus, and no indication of whether it's working or not. I feel like I should see something, especially if it's not working at all?

This is the current running command:

$ ps auxww|grep node
648205 nobody    0:13 /bin/node_exporter --path.rootfs=/host --collector.textfile.directory=/var/lib/node_exporter/textfile_collector --log.level=debug

Endemoniada avatar Jan 31 '22 13:01 Endemoniada

I have the same issue. I can start node_exporter 1.3.1 with ONLY the log level option and it ignores debug. It works as expected with error, warn, info, or when the flag is left unset. Setting it to an invalid option properly errors out with "unrecognized log level". Setting it to debug allows node_exporter to start, but the log level is not applied.

mrpaulmanley avatar Feb 22 '22 17:02 mrpaulmanley

Oh, I think the issue is simply that we don't add the --path.rootfs prefix to --collector.textfile.directory. @SuperQ Should we? I see how this is surprising.

@mrpaulmanley How do you know it ignores the debug level? During startup there are no debug messages expected. Once you scrape it, you should see debug messages.
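
For example, triggering a scrape by hand and then checking the container log should show debug lines (assuming the default port 9100 and the container name used earlier in this thread):

$ curl -s http://localhost:9100/metrics > /dev/null
$ docker logs node_exporter 2>&1 | grep level=debug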

discordianfish avatar Feb 23 '22 13:02 discordianfish

@discordianfish Oh, hmm, I'm not sure adding --path.rootfs makes sense. Users may have the textfile directory somewhere other than the rootfs and use something like -v /some_path:/textfile_collector in their Docker command.

SuperQ avatar Feb 23 '22 15:02 SuperQ

@SuperQ Yes, that's why I'm also not sure we should do it. The problem is just that it's not clear to users where --path.rootfs is prefixed and where it isn't, as is apparent from this report.

discordianfish avatar Feb 23 '22 15:02 discordianfish

Oh, I think the issue is simply that we don't add the --path.rootfs prefix to --collector.textfile.directory. @SuperQ Should we? I see how this is surprising.

@mrpaulmanley How do you know it ignores the debug level? During startup there are no debug messages expected. Once you scrape it, you should see debug messages.

It says the log level is info in the output, and the logging during a scrape is the same. I'll get some output for you once I have access to the equipment again.

mrpaulmanley avatar Feb 23 '22 16:02 mrpaulmanley

Hello,

I was running into a similar issue with picking up data from smartmon.sh using the textfile collector. Part of the issue seemed to be passing all the flags as a single one-liner in the command. With debug logging it gave an error like this:

ts=2024-04-11T05:54:25.133Z caller=filesystem_linux.go:136 level=debug collector=filesystem msg="Error on statfs() system call" rootfs="/host --collector.textfile --collector.textfile.directory=/var/lib/node_exporter/textfile_collector" err="no such file or directory"

After separating the flags into individual command entries it worked. Here is the docker-compose snippet for node_exporter that finally ran correctly for me:

  node_exporter:
    image: quay.io/prometheus/node-exporter:v1.7.0
    container_name: node_exporter
    ports:
      - 9100:9100
    command:
      - '--path.rootfs=/host'
      #- '--log.level=debug'
      #- '--collector.textfile'
      - '--collector.textfile.directory=/var/lib/node_exporter/textfile_collector'
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
      - '/var/lib/node_exporter/textfile_collector:/var/lib/node_exporter/textfile_collector'

ULOCKFIXED avatar Apr 11 '24 06:04 ULOCKFIXED