mountstats collector spams logs

Open pahaeanx opened this issue 1 year ago • 1 comments

It seems the NFS collector is broken in 1.9.0 and keeps spamming the logs with "was collected before with the same name and label values" messages.

Unfortunately I don't have access to the system, but I managed to get a few lines of log output:

label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:1}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_connect_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:749}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_idle_time_seconds\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} gauge:{value:28}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_sends_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:1.401758e+06}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_receives_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:1.401758e+06}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_bad_transaction_ids_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:0}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_backlog_queue_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:0}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_maximum_rpc_slots\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} gauge:{value:128}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_sending_queue_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}
label:{name:\"mountaddr\" value:\"10.200.255.131\"} label:{name:\"protocol\" value:\"tcp\"} counter:{value:0}} was collected before with the same name and label values\n* collected metric \"node_mountstats_nfs_transport_pending_queue_total\" { label:
{name:\"export\" value:\"10.200.255.131:/some_share\"}

This seems to affect only NFS-related metrics. It also repeats for every NFS mount, and these machines have a lot of them, hence the spam.
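
For anyone who can reach an affected host: the metrics in the log above carry only the export, mountaddr, and protocol labels, so one thing that might be worth checking is whether the same export is mounted more than once. The following is a minimal sketch in Go, assuming the usual `device <export> mounted on <mountpoint> with fstype <type>` line format of /proc/self/mountstats. The function name `duplicateExports`, the sample data, and the mount points are all hypothetical, and a match would not by itself confirm the cause.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleMountstats is made-up data in the style of /proc/self/mountstats,
// trimmed to the "device" lines. The mount points are hypothetical.
const sampleMountstats = `device 10.200.255.131:/some_share mounted on /mnt/a with fstype nfs statvers=1.1
device 10.200.255.131:/some_share mounted on /mnt/b with fstype nfs statvers=1.1
device 10.200.255.131:/other_share mounted on /mnt/c with fstype nfs statvers=1.1
device sysfs mounted on /sys with fstype sysfs
`

// duplicateExports returns, for every NFS export that is mounted more than
// once, the list of mount points it appears on.
func duplicateExports(mountstats string) map[string][]string {
	mounts := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(mountstats))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		// Assumed shape:
		// device <export> mounted on <mountpoint> with fstype <type> [statvers=...]
		if len(f) < 8 || f[0] != "device" || f[2] != "mounted" ||
			f[3] != "on" || f[5] != "with" || f[6] != "fstype" {
			continue
		}
		// Only NFS mounts are of interest here.
		if f[7] != "nfs" && f[7] != "nfs4" {
			continue
		}
		mounts[f[1]] = append(mounts[f[1]], f[4])
	}
	// Drop exports that are mounted only once.
	for export, points := range mounts {
		if len(points) < 2 {
			delete(mounts, export)
		}
	}
	return mounts
}

func main() {
	for export, points := range duplicateExports(sampleMountstats) {
		fmt.Printf("%s is mounted %d times: %s\n",
			export, len(points), strings.Join(points, ", "))
	}
}
```

On a real system, the contents of /proc/self/mountstats (read with `os.ReadFile`, for example) could be passed in place of the sample string.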

I know that:

- 1.8.2 worked flawlessly for months
- No changes were made to the system or the mounts
- It (apparently) started right after rolling out 1.9.0

So this is definitely related to 1.9.0.

Affected systems run Debian 11 with a version of the 5.15 kernel. Please let me know if you need more information; I'll try to get access to one of the systems.

Feb 20 '25 07:02 pahaeanx

That looks like the mountstats collector, not the nfs collector. There have been no changes to this collector since v1.8.2.

Feb 20 '25 07:02 SuperQ