Refactor perf collector to better handle failures and warn on opens

hodgesds opened this issue 3 years ago • 7 comments

This changes the perf collector to better handle partial failures by checking the HasProfilers method on the various profiler interfaces. The configuration of the profiler interfaces has changed so that an optional bitmask of profilers can be set. In theory this could allow much finer control over which profilers are enabled, but that is beyond the scope of this change. The profiler interfaces have also been changed so that, in theory, object pools could be used to reduce memory allocations.
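
For illustration only, a minimal sketch of how a bitmask-selected set of profiler groups combined with a HasProfilers-style check could look. The flag names and the updateEnabled function here are hypothetical, not the code in this PR; only the HasProfilers idea is taken from the description above.

// Hypothetical sketch (not the PR's actual API): select profiler groups via a
// bitmask and skip groups whose setup produced no usable profilers.
package main

import "log"

type profilerFlag uint64

const (
	hwProfilerFlag    profilerFlag = 1 << iota // hardware counter profilers
	swProfilerFlag                             // software counter profilers
	cacheProfilerFlag                          // cache counter profilers
)

// profilerGroup mirrors the HasProfilers idea from the PR description:
// report whether any counter in the group opened successfully.
type profilerGroup interface {
	HasProfilers() bool
}

func updateEnabled(mask profilerFlag, groups map[profilerFlag]profilerGroup) {
	for flag, g := range groups {
		if mask&flag == 0 {
			continue // group not selected in the bitmask
		}
		if !g.HasProfilers() {
			// Partial failure: warn and keep collecting from the other groups.
			log.Printf("profiler group %#x has no usable profilers, skipping", uint64(flag))
			continue
		}
		// ... update Prometheus metrics from this group ...
	}
}

func main() {
	updateEnabled(hwProfilerFlag|swProfilerFlag, nil)
}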

Tested running locally and then:

curl localhost:9100/metrics | grep perf | grep 'cpu="0"'
node_perf_branch_instructions_total{cpu="0"} 8.20418e+06
node_perf_branch_misses_total{cpu="0"} 162365
node_perf_cache_bpu_read_hits_total{cpu="0"} 9.985934e+06
node_perf_cache_bpu_read_misses_total{cpu="0"} 221845
node_perf_cache_l1_instr_read_misses_total{cpu="0"} 181514
node_perf_cache_l1d_read_hits_total{cpu="0"} 1.4483013e+07
node_perf_cache_l1d_read_misses_total{cpu="0"} 685931
node_perf_cache_misses_total{cpu="0"} 141926
node_perf_cache_refs_total{cpu="0"} 2.173699e+06
node_perf_cache_tlb_instr_read_hits_total{cpu="0"} 849
node_perf_cache_tlb_instr_read_misses_total{cpu="0"} 603
node_perf_context_switches_total{cpu="0"} 3930
node_perf_cpu_migrations_total{cpu="0"} 3
node_perf_cpucycles_total{cpu="0"} 1.87280477e+08
node_perf_instructions_total{cpu="0"} 4.4194532e+07
node_perf_major_faults_total{cpu="0"} 0
node_perf_minor_faults_total{cpu="0"} 1352
node_perf_page_faults_total{cpu="0"} 1352

Signed-off-by: Daniel Hodges [email protected]

hodgesds avatar Nov 01 '21 00:11 hodgesds

After this diff I'd like to add stalled frontend/backend cycles, as they can be used to give a better picture of instruction performance. Here's a post that has some background info as well as some of the ways that the perf subsystem exposes those metrics. They are already available in the perf-utils library, but haven't been added to the collector yet.
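
For background, a rough sketch of how those counters can be opened directly through perf_event_open using golang.org/x/sys/unix; perf-utils wraps this in its own API, so the helper function and the CPU/pid choices here are illustrative only.

// Illustrative only: open the stalled frontend/backend cycle counters for all
// processes on CPU 0 (pid == -1, cpu >= 0) via the raw perf_event_open syscall.
package main

import (
	"fmt"
	"unsafe"

	"golang.org/x/sys/unix"
)

func openHWCounter(config uint64, cpu int) (int, error) {
	attr := unix.PerfEventAttr{
		Type:   unix.PERF_TYPE_HARDWARE,
		Config: config,
	}
	attr.Size = uint32(unsafe.Sizeof(attr))
	return unix.PerfEventOpen(&attr, -1, cpu, -1, unix.PERF_FLAG_FD_CLOEXEC)
}

func main() {
	counters := map[string]uint64{
		"stalled_cycles_frontend": unix.PERF_COUNT_HW_STALLED_CYCLES_FRONTEND,
		"stalled_cycles_backend":  unix.PERF_COUNT_HW_STALLED_CYCLES_BACKEND,
	}
	for name, cfg := range counters {
		fd, err := openHWCounter(cfg, 0)
		if err != nil {
			// Some CPUs/PMUs (and most VMs) don't implement these counters.
			fmt.Printf("%s unavailable: %v\n", name, err)
			continue
		}
		fmt.Printf("%s opened as fd %d\n", name, fd)
		unix.Close(fd)
	}
}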

hodgesds avatar Nov 01 '21 00:11 hodgesds

Is it possible to just attempt opening the profiler handles and fail with a meaningful error? Not sure how I feel about checking open files. All(?) other collectors just attempt opening stuff and fail if they can't.

discordianfish avatar Nov 01 '21 10:11 discordianfish

Is it possible to just attempt opening the profiler handles and fail with a meaningful error?

Potentially; the problem is that some of the profilers could be unavailable due to kernel configuration or even the physical hardware (e.g., the CPU doesn't implement all hardware counters). The other problem is that there is a large combination of different profiler handle types, so having one flag to enable them all is a little problematic.

Not sure how I feel about checking open files. All(?) other collectors just attempt opening stuff and fail if they can't.

That's reasonable if setting up the profiler handles fails fast. I was almost thinking that check would be better as an info check during startup to let users know if they are using default configs that may need tuning (IIRC the default RLIMIT_NOFILE is 1024).
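
As a rough sketch of what such a startup info check could look like; the threshold, helper name, and log wording are hypothetical, and only the 1024 RLIMIT_NOFILE default mentioned above comes from this thread.

// Illustrative startup check: warn when the soft RLIMIT_NOFILE looks too low
// for the perf collector, which opens roughly one fd per profiler per CPU.
package main

import (
	"log"
	"runtime"

	"golang.org/x/sys/unix"
)

func warnIfLowFDLimit(profilersPerCPU uint64) {
	var lim unix.Rlimit
	if err := unix.Getrlimit(unix.RLIMIT_NOFILE, &lim); err != nil {
		log.Printf("could not read RLIMIT_NOFILE: %v", err)
		return
	}
	needed := profilersPerCPU * uint64(runtime.NumCPU())
	if lim.Cur < needed {
		log.Printf("RLIMIT_NOFILE soft limit is %d, but the perf collector may need around %d fds; consider raising the limit", lim.Cur, needed)
	}
}

func main() {
	// Hypothetical number of perf events opened per CPU.
	warnIfLowFDLimit(16)
}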

hodgesds avatar Nov 01 '21 13:11 hodgesds

Potentially; the problem is that some of the profilers could be unavailable due to kernel configuration or even the physical hardware (e.g., the CPU doesn't implement all hardware counters).

But then the call can fail and we return a meaningful error to the user, right?

The other problem is that there is a large combination of different profiler handle types, so having one flag to enable them all is a little problematic.

But this is orthogonal to the problem here, right? Maybe we should have a flag for each of them?

That's reasonable if setting up the profiler handles fails fast.

Why wouldn't they fail fast?

All in all, I might still be missing something, but I don't understand why we can't just open the handles and return a meaningful error if it fails (like 'out of fds' vs 'cpu doesn't support hw counters' vs 'perf not enabled in kernel'). That'd be ideal I think?

discordianfish avatar Nov 03 '21 10:11 discordianfish

All in all, I might still be missing something, but I don't understand why we can't just open the handles and return a meaningful error if it fails (like 'out of fds' vs 'cpu doesn't support hw counters' vs 'perf not enabled in kernel'). That'd be ideal I think?

Here's a simple example: if running in a virtual machine you may not have access to the underlying hardware counters, but you may still have access to software counters from the kernel. We can make it fail fast, but that makes the perf collector basically worthless for use inside a VM. Alternatively, it could do something like a best-effort setup of all the relevant profilers and fail only if none are available. The other tricky part is that the errors returned from perf_event_open are most likely EOPNOTSUPP, EINVAL, and EACCES. I'm guessing handling EOPNOTSUPP is the tricky part to get right, as that is returned if the desired counters are not available.
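
As a rough sketch of that kind of errno-to-message mapping, assuming the raw syscall errnos via golang.org/x/sys/unix; perf-utils has its own wrappers, and the helper name and message wording here are illustrative only.

// Illustrative: translate the common perf_event_open failure modes into
// messages a user can act on, instead of surfacing a bare errno.
package main

import (
	"errors"
	"fmt"

	"golang.org/x/sys/unix"
)

func explainPerfOpenError(err error) string {
	switch {
	case errors.Is(err, unix.EOPNOTSUPP):
		return "counter not supported by this CPU/PMU (common inside VMs); software counters may still work"
	case errors.Is(err, unix.EACCES), errors.Is(err, unix.EPERM):
		return "insufficient permissions; check kernel.perf_event_paranoid or CAP_PERFMON"
	case errors.Is(err, unix.EMFILE), errors.Is(err, unix.ENFILE):
		return "out of file descriptors; raise RLIMIT_NOFILE"
	case errors.Is(err, unix.EINVAL):
		return "invalid or unsupported event attributes for this kernel"
	default:
		return fmt.Sprintf("perf_event_open failed: %v", err)
	}
}

func main() {
	fmt.Println(explainPerfOpenError(unix.EOPNOTSUPP))
}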

hodgesds avatar Nov 10 '21 01:11 hodgesds

Alternatively, it could do something like a best-effort setup of all the relevant profilers and fail only if none are available.

Yeah, that, plus logging the ones that can't be set up, sounds reasonable. I'd get rid of the file handles check here though and return an error if out of fds.

@SuperQ wdyt?

discordianfish avatar Nov 12 '21 10:11 discordianfish

I think this needs rebasing to pick up the macOS fix.

discordianfish avatar Nov 15 '21 09:11 discordianfish