hyperfine separate parameters per command

given two distinct commands cmd1 and cmd2, which take the parameters A and B respectively (each of which can take several values), i want to be able to compare the two commands with hyperfine.

without parameters i would do hyperfine "cmd1 A" "cmd2 B"

but when i add parameters: hyperfine -L A A_values -L B B_values "cmd1 {A}" "cmd2 {B}" i get a product of A and B for both commands, meaning that cmd2 is measured with identical parameters multiple times per value of B.
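The duplication described above can be reproduced with a small Python model of cross-product expansion. This is only an illustrative sketch, not hyperfine's actual implementation: the function name `expand_cross_product` is hypothetical, the values `1, 2` stand in for `A_values` and `B_values`, and the order in which hyperfine schedules the runs is not modeled.

```python
from itertools import product


def expand_cross_product(templates, parameters):
    """Substitute every combination of parameter values into every template."""
    names = list(parameters)
    commands = []
    for template in templates:
        # Every template sees the full product, even for parameters it ignores.
        for values in product(*(parameters[name] for name in names)):
            command = template
            for name, value in zip(names, values):
                command = command.replace("{" + name + "}", str(value))
            commands.append(command)
    return commands


# Hypothetical stand-ins for A_values and B_values.
commands = expand_cross_product(
    ["cmd1 {A}", "cmd2 {B}"],
    {"A": [1, 2], "B": [1, 2]},
)
for command in commands:
    print(command)
```

With two values per parameter this yields eight runs but only four distinct command lines, which is the redundancy the question is about.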

is there a way to only benchmark one instance of each command (after substituting the parameters)? if not i would like to request that as a feature.

Apr 15 '23 11:04 nivkner

Thank you for this feature request.

We considered this mode of combining parameters ("zip" as opposed to "cross product") back then when we designed the "parameter matrix" feature, see https://github.com/sharkdp/hyperfine/issues/253#issuecomment-573109473

I argued that it would be a less common use case, but to be honest, I didn't think of your scenario where multiple commands take parameters.

I don't think it would be difficult to implement this, but the challenge is to design the command-line interface for it, ideally in a backwards-compatible way. I don't think that adding --zip-parameter, as suggested in the linked ticket, is a good option, because what would --parameter-scan … --zip-parameter … mean? Or --parameter-scan … --parameter-scan … --zip-parameter …?

Another choice could be to have something like a --parameter-combination={product,zipped} option which would default to product? It's not the most flexible option, but maybe sufficient?
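To make the difference between the two modes concrete, here is a rough Python sketch of what "product" and "zipped" could mean for a set of parameter lists. The `combine` function and its `mode` argument are hypothetical and only mirror the option floated above; no such option exists in hyperfine as far as this thread shows. Requiring equal-length lists in zipped mode is an assumption of the sketch, not something decided in the discussion.

```python
from itertools import product


def combine(parameters, mode="product"):
    """Return a list of {name: value} bindings for the given mode."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    if mode == "product":
        # Every value of every parameter paired with every other.
        combos = product(*value_lists)
    elif mode == "zipped":
        # Assumption of this sketch: zipped lists must have equal length.
        if len({len(values) for values in value_lists}) > 1:
            raise ValueError("zipped mode needs value lists of equal length")
        # The i-th value of each parameter is used together.
        combos = zip(*value_lists)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [dict(zip(names, combo)) for combo in combos]


parameters = {"A": [1, 2], "B": [3, 4]}
product_bindings = combine(parameters, "product")
zipped_bindings = combine(parameters, "zipped")
print(len(product_bindings), "bindings in product mode")
print(len(zipped_bindings), "bindings in zipped mode")
```

A single global switch like this applies to all parameters at once, which is why it is less flexible than a per-parameter flag.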

What other choices do we have, assuming we want to implement this? Could we even auto-detect those cases somehow (because nobody wants to run cmd1 {A} for multiple values of B)?

partially related: #575

Apr 17 '23 19:04 sharkdp

i do think auto-detection makes sense, since adding an option might complicate the interface and as you said, could produce ambiguity.

i suggest the following:

for every command, identify which parameters it does not use. then, when substituting, treat each unused parameter as having the single value NULL (like in SQL, though it might be called something else). that way a command which uses A but not B executes only once per value of A, as the product of A's values and the one-element set {NULL}.
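A minimal Python sketch of this proposal, assuming placeholders are written as `{NAME}` and using `None` in place of NULL. The function name `expand_used_parameters` is made up for illustration, and this is not how hyperfine is implemented.

```python
from itertools import product


def expand_used_parameters(templates, parameters):
    """Expand each template only over the parameters it actually references.

    Returns (command, binding) pairs; unused parameters are bound to None.
    """
    runs = []
    for template in templates:
        # A parameter counts as used if its placeholder appears in the template.
        used = [name for name in parameters if "{" + name + "}" in template]
        for values in product(*(parameters[name] for name in used)):
            binding = dict.fromkeys(parameters)  # every parameter starts as None
            binding.update(zip(used, values))
            command = template
            for name in used:
                command = command.replace("{" + name + "}", str(binding[name]))
            runs.append((command, binding))
    return runs


runs = expand_used_parameters(
    ["cmd1 {A}", "cmd2 {B}"],
    {"A": [1, 2], "B": [1, 2]},
)
for command, binding in runs:
    print(command, binding)
```

A command that uses no parameters at all would run exactly once here, because the product over zero lists has a single empty combination.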

in outputs which don't show the parameters, like markdown, this would look exactly like executing the command once. in outputs that do, like CSV, you can place a NULL, - or a blank space in place of the parameter (whatever works as an indicator that there is no value there).

for example with A={1,2},B={1,2} the output of the CSV might look like this

command,mean,stddev,median,user,system,min,max,parameter_A,parameter_B
cmd1 1, ... ,1,NULL
cmd1 2, ... ,2,NULL
cmd2 1, ... ,NULL,1
cmd2 2, ... ,NULL,2
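For illustration, the parameter columns of such a CSV could be produced along these lines in Python. The timing columns are left out because they are elided in the example above. The function `render_parameter_columns` and the `missing` marker argument are hypothetical, and the input list is written out by hand rather than taken from a real benchmark.

```python
import csv
import io


def render_parameter_columns(runs, parameter_names, missing="NULL"):
    """Render command and parameter columns, marking absent values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["command"] + [f"parameter_{name}" for name in parameter_names])
    for command, binding in runs:
        row = [command]
        for name in parameter_names:
            value = binding.get(name)
            # A parameter the command does not use gets the marker instead.
            row.append(missing if value is None else value)
        writer.writerow(row)
    return buffer.getvalue()


# Hand-written input matching the A={1,2}, B={1,2} example.
runs = [
    ("cmd1 1", {"A": 1}),
    ("cmd1 2", {"A": 2}),
    ("cmd2 1", {"B": 1}),
    ("cmd2 2", {"B": 2}),
]
text = render_parameter_columns(runs, ["A", "B"])
print(text, end="")
```

Passing `missing="-"` or `missing=""` would give the other two indicators mentioned above.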

EDIT: incidentally, detecting unused parameters is also needed for #600 so that synergizes nicely.

Apr 18 '23 10:04 nivkner
