hyperfine
Windows user & system times incorrect
It looks like user and kernel times on Windows are calculated for the shell process only, which is only used for spawning other processes and waiting for them, resulting in zero times.
I tried running processes directly instead of going through the shell, using the snippet below, and this way I do get relevant results:
use std::io;
use std::process::{Command, Stdio};

/// Run a Windows shell command using `cmd.exe /C`.
/// Workaround: if the command line is non-empty, bypass the shell and spawn
/// the program directly so that its own user/kernel times are measured.
#[cfg(windows)]
fn run_shell_command(
    stdout: Stdio,
    stderr: Stdio,
    command: &str,
    shell: &str,
) -> io::Result<std::process::Child> {
    if command.is_empty() {
        // Nothing to split; fall back to the intermediate shell.
        Command::new(shell)
            .arg("/C")
            .arg(command)
            .stdin(Stdio::null())
            .stdout(stdout)
            .stderr(stderr)
            .spawn()
    } else {
        // Naive whitespace split: the first token is the program, the rest are its arguments.
        let mut parts = command.split_whitespace();
        Command::new(parts.next().unwrap())
            .args(parts)
            .stdin(Stdio::null())
            .stdout(stdout)
            .stderr(stderr)
            .spawn()
    }
}
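For reference, the Windows API in question is GetProcessTimes, which reports user/kernel CPU time for a single process handle; when that handle belongs to cmd.exe, the children's times are simply not included. Below is a minimal sketch of reading a directly spawned child's own times (it assumes the windows-sys crate with the Win32_Foundation and Win32_System_Threading features; it is not hyperfine's actual code, and the benchmarked command is just a placeholder):

// Hedged sketch, not hyperfine's implementation: read a child's own CPU
// times with GetProcessTimes (assumes the `windows-sys` crate).
use std::os::windows::io::AsRawHandle;
use std::process::Command;

use windows_sys::Win32::Foundation::FILETIME;
use windows_sys::Win32::System::Threading::GetProcessTimes;

// FILETIME counts 100-nanosecond intervals.
fn filetime_to_seconds(ft: &FILETIME) -> f64 {
    let ticks = ((ft.dwHighDateTime as u64) << 32) | ft.dwLowDateTime as u64;
    ticks as f64 / 1e7
}

fn main() -> std::io::Result<()> {
    // Spawn the benchmarked program directly, without an intermediate cmd.exe
    // ("ping" is only a placeholder command for illustration).
    let mut child = Command::new("ping").args(["-n", "2", "127.0.0.1"]).spawn()?;
    child.wait()?;

    let zero = FILETIME { dwLowDateTime: 0, dwHighDateTime: 0 };
    let (mut creation, mut exit, mut kernel, mut user) = (zero, zero, zero, zero);
    // GetProcessTimes only covers this one process handle; time spent in
    // grandchild processes (e.g. commands run by cmd.exe) is not included.
    let ok = unsafe {
        GetProcessTimes(
            child.as_raw_handle() as _,
            &mut creation,
            &mut exit,
            &mut kernel,
            &mut user,
        )
    };
    if ok != 0 {
        println!(
            "user: {:.3} s, system: {:.3} s",
            filetime_to_seconds(&user),
            filetime_to_seconds(&kernel)
        );
    }
    Ok(())
}

Spawning the target directly, as in the workaround above, makes these numbers reflect the benchmarked program rather than the shell.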
Thank you for reporting this! That should be fixed.
Related: https://stackoverflow.com/questions/36011572/how-to-obtain-handles-for-all-children-process-of-current-process-in-windows
Also related: https://github.com/sharkdp/hyperfine/issues/336
Hi! I'm a student using hyperfine for the research for my Master's thesis. I'm not quite sure my issue is related, since you said this problem should already be fixed and my user and system times are actually above zero, but maybe this issue wasn't properly fixed? Or maybe I just have a very wrong understanding of what exactly the user and system times show.
I'm currently benchmarking C++, C#, and Java implementations of one of the problems from the Computer Language Benchmarks Game. The user and system times are always very small compared to the wall-clock time. One of the example results that confuse me is:
Time (mean ± σ): 49.903 s ± 0.082 s [User: 6.9 ms, System: 16.3 ms]
Range (min … max): 49.815 s … 50.043 s 10 runs
which is for a sequential C++ program that should be spending most of its time dynamically allocating and deallocating memory. I've even done some profiling (using a sampling profiler) which confirms that this program spends at least 75% of its time in Windows kernel code. Why is the system time so low, then? Shouldn't it be three orders of magnitude bigger? And the same for the user time...
Sorry for the misunderstanding. By "this should be fixed", I meant: we should fix this. It has not been fixed yet, otherwise the issue would be closed.
The user/system times you are seeing are probably just the spawning times of cmd.exe and should be pretty much always the same, regardless of which process you actually benchmark.
Any help on this would be much appreciated. The easiest "fix" or workaround might be to implement #336.
Oh, that's unfortunate. I have no experience with making pull requests to open-source projects or with Rust, so I doubt I can be of much help... But isn't #336 almost done? It looks to me like cipriancraciun already did everything, but just forgot to make the final push.
In hyperfine 1.13, we now have the option to disable the intermediate shell using -N/--shell=none. Could someone please verify that we get correct user/system timings on Windows when using this option?
A quick fix for this ticket might be to hide user/system times on Windows by default, unless the -N/--shell=none option is used.
Could someone please verify that we get correct user/system timings on Windows when using this option?
I confirm it's now working fine :)
Without disabling the intermediate shell:
hyperfine '.\Cs_projects\nbody_cs\bin\x64\Release\netcoreapp3.1\nbody_cs.exe 50000000' '.\nbody_gpp.exe 50000000' 'java nbody 50000000'
Benchmark 1: .\Cs_projects\nbody_cs\bin\x64\Release\netcoreapp3.1\nbody_cs.exe 50000000
Time (mean ± σ): 7.120 s ± 0.049 s [User: 0.001 s, System: 0.011 s]
Range (min … max): 7.069 s … 7.222 s 10 runs
Benchmark 2: .\nbody_gpp.exe 50000000
Time (mean ± σ): 6.951 s ± 0.018 s [User: 0.001 s, System: 0.006 s]
Range (min … max): 6.928 s … 6.974 s 10 runs
Benchmark 3: java nbody 50000000
Time (mean ± σ): 7.177 s ± 0.017 s [User: 0.000 s, System: 0.006 s]
Range (min … max): 7.156 s … 7.206 s 10 runs
Summary
'.\nbody_gpp.exe 50000000' ran
1.02 ± 0.01 times faster than '.\Cs_projects\nbody_cs\bin\x64\Release\netcoreapp3.1\nbody_cs.exe 50000000'
1.03 ± 0.00 times faster than 'java nbody 50000000'
With -N:
hyperfine -N './Cs_projects/nbody_cs/bin/x64/Release/netcoreapp3.1/nbody_cs.exe 50000000' './nbody_gpp.exe 50000000' 'java nbody 50000000'
Benchmark 1: ./Cs_projects/nbody_cs/bin/x64/Release/netcoreapp3.1/nbody_cs.exe 50000000
Time (mean ± σ): 7.087 s ± 0.022 s [User: 7.053 s, System: 0.041 s]
Range (min … max): 7.059 s … 7.126 s 10 runs
Benchmark 2: ./nbody_gpp.exe 50000000
Time (mean ± σ): 6.951 s ± 0.012 s [User: 6.941 s, System: 0.005 s]
Range (min … max): 6.928 s … 6.968 s 10 runs
Benchmark 3: java nbody 50000000
Time (mean ± σ): 7.166 s ± 0.011 s [User: 7.147 s, System: 0.061 s]
Range (min … max): 7.148 s … 7.180 s 10 runs
Summary
'./nbody_gpp.exe 50000000' ran
1.02 ± 0.00 times faster than './Cs_projects/nbody_cs/bin/x64/Release/netcoreapp3.1/nbody_cs.exe 50000000'
1.03 ± 0.00 times faster than 'java nbody 50000000'
Okay. So an easy fix for now could be to disable the display of user and system times on Windows, unless --shell=none is used.
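A minimal sketch of that check (the names are illustrative only and do not reflect hyperfine's actual internals):

// Illustrative only; the real option/field names in hyperfine may differ.
/// Decide whether to print user/system times for a benchmark result.
fn show_cpu_times(shell_is_used: bool) -> bool {
    // On Windows, CPU times measured through an intermediate cmd.exe only
    // cover the shell itself, so hide them unless --shell=none (-N) is used.
    !cfg!(windows) || !shell_is_used
}

With this, show_cpu_times(false) is true everywhere, while show_cpu_times(true) is false on Windows and true on other platforms.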