
How to run nperf with "cargo bench"?

brainstorm opened this issue 2 years ago • 5 comments

I would like to profile my benchmarks as they run, rather than profiling a target bin file directly. What would be the correct CLI syntax for that? I've tried the following without success (the output datafile is not generated):

% cargo run record -P $(cargo bench) -w -o datafile
   Compiling htsget-benchmarks v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-benchmarks)
   Finished bench [optimized + debuginfo] target(s) in 7.37s
   Running benches/refserver_benchmarks.rs (/Users/rvalls/dev/umccr/htsget-rs/target/release/deps/refserver_benchmarks-fecd9ace2aeca2e9)
     Running benches/request_benchmarks.rs (/Users/rvalls/dev/umccr/htsget-rs/target/release/deps/request_benchmarks-f78a77d95f70b5d1)
     Running benches/search_benchmarks.rs (/Users/rvalls/dev/umccr/htsget-rs/target/release/deps/search_benchmarks-407b85d5c1d86b5d)
Benchmarking Queries/[LIGHT] Bam query all
Benchmarking Queries/[LIGHT] Bam query all: Warming up for 3.0000 s
Benchmarking Queries/[LIGHT] Bam query all: Collecting 50 samples in estimated 30.048 s (487k iterations)
Benchmarking Queries/[LIGHT] Bam query all: Analyzing
Benchmarking Queries/[LIGHT] Bam query specific
Benchmarking Queries/[LIGHT] Bam query specific: Warming up for 3.0000 s
Benchmarking Queries/[LIGHT] Bam query specific: Collecting 50 samples in estimated 30.260 s (66k iterations)
Benchmarking Queries/[LIGHT] Bam query specific: Analyzing
Benchmarking Queries/[LIGHT] Bam query header
Benchmarking Queries/[LIGHT] Bam query header: Warming up for 3.0000 s
Benchmarking Queries/[LIGHT] Bam query header: Collecting 50 samples in estimated 30.096 s (282k iterations)
Benchmarking Queries/[LIGHT] Bam query header: Analyzing
error: a bin target must be available for `cargo run`

/cc @mmalenic

brainstorm · May 19 '23

nperf record -P refserver_benchmarks-fecd9ace2aeca2e9 -w

The -P argument needs the name of the executable that nperf will search for in the process list. In your case cargo bench runs three executables, so you'll need three separate nperf record invocations.
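For example, taking the executable names from your cargo bench output above (the output file names are just placeholders, and this assumes -w makes nperf wait for a matching process to appear, so each command is started, e.g. in the background, before cargo bench):

nperf record -P refserver_benchmarks-fecd9ace2aeca2e9 -w -o refserver.nperf
nperf record -P request_benchmarks-f78a77d95f70b5d1 -w -o request.nperf
nperf record -P search_benchmarks-407b85d5c1d86b5d -w -o search.nperf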

koute · May 19 '23

Thanks @koute, I thought about doing just that, but I didn't want to clash with other previously generated criterion-rs binaries. Would you be open to having some external profiler hooking capabilities for criterion-rs in not-perf? See: https://bheisler.github.io/criterion.rs/book/user_guide/profiling.html#implementing-in-process-profiling-hooks

/cc @mmalenic

brainstorm · May 22 '23

Would you be open to having some external profiler hooking capabilities for criterion-rs in not-perf? See: https://bheisler.github.io/criterion.rs/book/user_guide/profiling.html#implementing-in-process-profiling-hooks

What exactly would you like to do here? Do you mean having a crate that would pull not-perf in and implement that trait?

koute · May 22 '23

What exactly would you like to do here? Do you mean having a crate that would pull not-perf in and implement that trait?

Precisely, something like https://www.jibbow.com/posts/criterion-flamegraphs/ but with not-perf, because the trace files generated by pprof are way too big to handle on GHA workers (~2-4 GB on the last test run).

brainstorm · May 23 '23

What exactly would you like to do here? Do you mean having a crate that would pull not-perf in and implement that trait?

Precisely, something like https://www.jibbow.com/posts/criterion-flamegraphs/ but with not-perf, because the trace files generated by pprof are way too big to handle on GHA workers (~2-4 GB on the last test run).

Well, I'd be fine with that. We'd probably want to add a new crate that'd expose an implementation of that trait and use not-perf's internals to do its work.
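For reference, criterion's in-process hook is the criterion::profiler::Profiler trait. Below is a rough sketch of the shape such an implementation could take; note it just shells out to the external nperf binary rather than using not-perf's internals as proposed here, and the NperfProfiler name, the -p (PID) flag, and the SIGINT shutdown are assumptions for illustration, not an existing API.

```rust
// Hypothetical sketch only: `NperfProfiler`, the `-p` flag, and the SIGINT
// shutdown are assumptions, not a published not-perf API.
// Requires the `criterion` and `libc` crates.
use std::path::Path;
use std::process::{Child, Command};

use criterion::profiler::Profiler;
use criterion::Criterion;

pub struct NperfProfiler {
    child: Option<Child>,
}

impl Profiler for NperfProfiler {
    fn start_profiling(&mut self, benchmark_id: &str, benchmark_dir: &Path) {
        std::fs::create_dir_all(benchmark_dir).expect("failed to create benchmark dir");
        let out = benchmark_dir.join(format!("{}.nperf", benchmark_id.replace('/', "_")));

        // Spawn the external `nperf` binary and attach it to this (benchmark)
        // process by PID, writing the profiling data next to criterion's output.
        let child = Command::new("nperf")
            .args(["record", "-p"])
            .arg(std::process::id().to_string())
            .arg("-o")
            .arg(out)
            .spawn()
            .expect("failed to spawn nperf; is it on PATH?");
        self.child = Some(child);
    }

    fn stop_profiling(&mut self, _benchmark_id: &str, _benchmark_dir: &Path) {
        if let Some(mut child) = self.child.take() {
            // Ask nperf to stop and flush its output file. This assumes nperf
            // exits cleanly on SIGINT; `Child::kill()` would send SIGKILL instead.
            unsafe { libc::kill(child.id() as i32, libc::SIGINT) };
            let _ = child.wait();
        }
    }
}

// Wire the profiler into criterion; the profiler only runs when the benches
// are invoked with `--profile-time <seconds>`, e.g.
// `cargo bench --bench request_benchmarks -- --profile-time 10`.
fn profiled() -> Criterion {
    Criterion::default().with_profiler(NperfProfiler { child: None })
}
```

With criterion_group!'s config field pointed at profiled() (as described in the criterion profiling docs linked above), this would write one .nperf file per benchmark instead of the multi-gigabyte pprof traces.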

koute · May 29 '23