hyperfine icon indicating copy to clipboard operation
hyperfine copied to clipboard

Windows user & system times incorrect

Open kuviman opened this issue 4 years ago • 8 comments

Looks like user and kernel time on Windows is calculated for the shell process only, which is only used for spawning other processes and waiting for them, resulting in zero times.

I tried running processes directly instead of using the shell by using this and this way I do get relevant results:

/// Run a Windows shell command using `cmd.exe /C`
#[cfg(windows)]
fn run_shell_command(
    stdout: Stdio,
    stderr: Stdio,
    command: &str,
    shell: &str,
) -> io::Result<std::process::Child> {
    if command.is_empty() {
        Command::new(shell)
            .arg("/C")
            .arg(command)
            .stdin(Stdio::null())
            .stdout(stdout)
            .stderr(stderr)
            .spawn()
    } else {
        let mut parts = command.split_whitespace();
        Command::new(parts.next().unwrap())
            .args(parts)
            .stdin(Stdio::null())
            .stdout(stdout)
            .stderr(stderr)
            .spawn()
    }
}

kuviman avatar Feb 10 '21 05:02 kuviman

Thank you for reporting this! That should be fixed.

Related: https://stackoverflow.com/questions/36011572/how-to-obtain-handles-for-all-children-process-of-current-process-in-windows

sharkdp avatar Feb 15 '21 20:02 sharkdp

Also related: https://github.com/sharkdp/hyperfine/issues/336

sharkdp avatar Feb 15 '21 20:02 sharkdp

Hi! I'm a student using hyperfine for the research for my Master's thesis. I'm not quite sure my issue is related, since you said this problem should already be solved and my user and system times are actually above zero, but maybe this issue wasn't properly fixed? Or maybe I just have a very wrong understanding of what exactly do the user and system time show.

I'm currently benchmarking C++,C# and Java implementations of one of the problems from the Computer Language Benchmarks game. The user and system times are always very small compared to the wall clock time. One of the example results that confuse me is:

Time (mean ± σ):     49.903 s ±  0.082 s    [User: 6.9 ms, System: 16.3 ms]
Range (min … max):   49.815 s … 50.043 s    10 runs

which is for a sequential C++ program which should be spending most of its time dynamically allocating and deallocating memory. I've even done some profiling (using a sampling profiler) which confirms that this program spends at least 75% of its time in Windows kernel code. Why is the system time so low then? Shouldn't it be 3 orders of magnitude bigger? And the same for user time...

Pietruszek avatar May 08 '21 14:05 Pietruszek

Sorry for the misunderstanding. By "this should be fixed", I meant: we should fix this. It has not been fixed yet, otherwise the issue would be closed.

The user/system times you are seeing are probably the spawning times of cmd.exe and should be pretty much always the same, nevertheless which process you actually benchmark.

Any help on this would be very appreciated. The easiest "fix" or workaround might be to implement #336.

sharkdp avatar May 08 '21 20:05 sharkdp

Oh, that's unfortunate. I have no experience with making pull requests to open-source or with Rust, so I doubt I can be of much help... But isn't #336 almost done? It looks to me that cipriancraciun already did everything, but just forgot to make the final push.

Pietruszek avatar May 13 '21 11:05 Pietruszek

In hyperfine 1.13, we now have the option to disable the intermediate shell using -N/--shell=none. Could someone please verify that we get correct user/system timings on Windows when using this option?

A quick fix for this ticket might be to hide user/system times on Windows by default, unless the -N/--shell option is used.

sharkdp avatar Mar 05 '22 12:03 sharkdp

Could someone please verify that we get correct user/system timings on Windows when using this option?

I confirm it's now working fine :)

Without disabling the intermediate shell:

hyperfine '.\Cs_projects\nbody_cs\bin\x64\Release\netcoreapp3.1\nbody_cs.exe 50000000' '.\nbody_gpp.exe 50000000' 'java nbody 50000000'
Benchmark 1: .\Cs_projects\nbody_cs\bin\x64\Release\netcoreapp3.1\nbody_cs.exe 50000000
  Time (mean ± σ):      7.120 s ±  0.049 s    [User: 0.001 s, System: 0.011 s]
  Range (min … max):    7.069 s …  7.222 s    10 runs

Benchmark 2: .\nbody_gpp.exe 50000000
  Time (mean ± σ):      6.951 s ±  0.018 s    [User: 0.001 s, System: 0.006 s]
  Range (min … max):    6.928 s …  6.974 s    10 runs

Benchmark 3: java nbody 50000000
  Time (mean ± σ):      7.177 s ±  0.017 s    [User: 0.000 s, System: 0.006 s]
  Range (min … max):    7.156 s …  7.206 s    10 runs

Summary
  '.\nbody_gpp.exe 50000000' ran
    1.02 ± 0.01 times faster than '.\Cs_projects\nbody_cs\bin\x64\Release\netcoreapp3.1\nbody_cs.exe 50000000'
    1.03 ± 0.00 times faster than 'java nbody 50000000'

With -N:

hyperfine -N './Cs_projects/nbody_cs/bin/x64/Release/netcoreapp3.1/nbody_cs.exe 50000000' './nbody_gpp.exe 50000000' 'java nbody 50000000'
Benchmark 1: ./Cs_projects/nbody_cs/bin/x64/Release/netcoreapp3.1/nbody_cs.exe 50000000
  Time (mean ± σ):      7.087 s ±  0.022 s    [User: 7.053 s, System: 0.041 s]
  Range (min … max):    7.059 s …  7.126 s    10 runs

Benchmark 2: ./nbody_gpp.exe 50000000
  Time (mean ± σ):      6.951 s ±  0.012 s    [User: 6.941 s, System: 0.005 s]
  Range (min … max):    6.928 s …  6.968 s    10 runs

Benchmark 3: java nbody 50000000
  Time (mean ± σ):      7.166 s ±  0.011 s    [User: 7.147 s, System: 0.061 s]
  Range (min … max):    7.148 s …  7.180 s    10 runs

Summary
  './nbody_gpp.exe 50000000' ran
    1.02 ± 0.00 times faster than './Cs_projects/nbody_cs/bin/x64/Release/netcoreapp3.1/nbody_cs.exe 50000000'
    1.03 ± 0.00 times faster than 'java nbody 50000000'

Pietruszek avatar May 09 '22 14:05 Pietruszek

Okay. So an easy fix for now could be to disable the display of user and system times on Windows, unless --shell=none is used

sharkdp avatar May 16 '22 19:05 sharkdp