node_exporter
                                
                                 node_exporter copied to clipboard
                                
                                    node_exporter copied to clipboard
                            
                            
                            
                        Add hw_counters metrics for infiniband device.
Add hw_counters metrics for infiniband device.
ref: https://github.com/prometheus/procfs/pull/549
Beside these, maybe consider combining metrics that can be summed up (like multiple types of errors etc) into one with different labels. See https://prometheus.io/docs/practices/naming/
Hi, how is progress on this? We would love to have these counters :)
thanks @discordianfish. Please re-check
This would be really useful for us, any update?
@SuperQ Please recheck it, thanks.
Interested in this PR too.
Ping again
The general idea also applies to the other metrics. We need to carefully think about the names since changing them in the future is a breaking change that will need to be enabled by adding a flag etc etc. So sorry if that feels pedentic but we need to make sure we get these names right.
Also, see my other comment here: https://github.com/prometheus/node_exporter/pull/2827#issuecomment-1770448902
PS: Sorry for not being responsive. I don't get paid to work on this so my availability is limited :-/
The general idea also applies to the other metrics. We need to carefully think about the names since changing them in the future is a breaking change that will need to be enabled by adding a flag etc etc. So sorry if that feels pedentic but we need to make sure we get these names right.
Also, see my other comment here: #2827 (comment)
PS: Sorry for not being responsive. I don't get paid to work on this so my availability is limited :-/
Thank you for your review. I will fix these issues as soon as possible. 🙏
@discordianfish Please re-check, thanks
https://github.com/prometheus/node_exporter/pull/2827/commits/8b5605040ae8f57566211fd7e557dc12363f5965
Better, thanks! Still, can you consider combining some of these metrics using the same name and different labels?
Better, thanks! Still, can you consider combining some of these metrics using the same name and different labels?
Thanks for your review. Do you have any suggestions to combine some of these indicators using the same name and different labels? @discordianfish
@dongjiang1989 That depends on the metrics, I don't really have the resources to go through all these and suggest something. See the linked best practices, particular this quote:
As a rule of thumb, either the sum() or the avg() over all dimensions of a given metric should be meaningful (though not necessarily useful). If it is not meaningful, split the data up into multiple metrics. For example, having the capacity of various queues in one metric is good, while mixing the capacity of a queue with the current number of elements in the queue is not.
One candidate might be the error metrics where you could have one and distinguish between them via some sort of error type label etc
The hw_counters is still important metrics, Hope to continue updating!