criterion.rs
Throughput the same as input size; line graph and throughput output are mutually exclusive
Currently, specifying a throughput, ostensibly for the console report, also sets the input size. This is documented as "set the input size for this benchmark group", so it is the intended behaviour, but it is unclear to me why this behaviour is correct. The TL;DR is that I want to report a throughput that is not related to input size.
I am trying to use throughput to get useful console reports, and also generate an appropriate line graph. By line graph, I mean the line graph at the bottom of this page https://bheisler.github.io/criterion.rs/book/user_guide/benchmarking_with_inputs.html. I understand that the line graph measures the time as input size increases. My problem is that my input size is not equal to the throughput I want to measure.
My core benchmark function is here. I am benchmarking the performance of various vote counting methods. For throughput, I want to measure the number of votes counted per second.
I am varying the total number of seats to be won (as some methods take more time to count for more seats). I want the line graph to show the performance as the number of seats increases. In other words, the input size is the number of seats; the desired x-axis on the line graph is the number of seats.
Hopefully, the problem is clear by now. group.throughput sets both the throughput and the input size, but in my case they are independent of each other.
Code
Currently this line is commented out:
group.throughput(Throughput::Elements(n_voters as u64));
The line graph I want will be plotted, with the number of seats as the x-axis. However there is no throughput reporting.
To enable throughput reporting, uncomment the line. Throughput will be reported, but the line graph will not be plotted.
Because throughput and input size are the same setting, whenever they need to differ, the desired line graph and the desired throughput output are mutually exclusive.
To be clear, I agree that the line graph should not be plotted if the line is uncommented, because the input size is then the same for every benchmark in the group. I just don't think setting the throughput should also set the input size.
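For illustration, votes per second can also be derived by hand from a measured time while the throughput line stays commented out. The following is a minimal sketch using only the standard library, not a criterion feature. The names votes_per_second, count_votes and measure are hypothetical, and count_votes is a simple plurality stand-in rather than any of the counting methods being benchmarked. It times a single run, so the figure is far noisier than criterion's statistics.

```rust
use std::time::{Duration, Instant};

/// Votes counted per second for one run; `None` if no time elapsed.
fn votes_per_second(n_voters: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        Some(n_voters as f64 / secs)
    } else {
        None
    }
}

/// Hypothetical stand-in for a vote counting method: tallies one vote per
/// ballot, then returns the `n_seats` candidates with the most votes.
/// Each ballot must hold a candidate index below `n_candidates`.
fn count_votes(ballots: &[usize], n_candidates: usize, n_seats: usize) -> Vec<usize> {
    let mut tally = vec![0u64; n_candidates];
    for &choice in ballots {
        tally[choice] += 1;
    }
    let mut ranked: Vec<usize> = (0..n_candidates).collect();
    // Most votes first; ties go to the lower candidate index.
    ranked.sort_by(|&a, &b| tally[b].cmp(&tally[a]).then(a.cmp(&b)));
    ranked.truncate(n_seats);
    ranked
}

/// Times a single count and returns (n_seats, votes per second), so the
/// number of seats stays the x-axis value while throughput is in votes.
fn measure(ballots: &[usize], n_candidates: usize, n_seats: usize) -> (usize, Option<f64>) {
    let start = Instant::now();
    let winners = count_votes(ballots, n_candidates, n_seats);
    let elapsed = start.elapsed();
    // Keep the result alive so the compiler cannot discard the work.
    std::hint::black_box(winners);
    (n_seats, votes_per_second(ballots.len() as u64, elapsed))
}
```

Calling measure once per value of n_seats gives one (seats, votes per second) pair per point, which keeps the two quantities independent at the cost of doing the bookkeeping outside criterion.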
Workaround attempt 1 - treat throughput as input size only, ignore its output
I can use the number of seats as throughput, but I am more interested in how many votes it can count per second, not how many seats it can "process" per second, so the throughput console output is meaningless to me.
Workaround attempt 2 - make throughput and input size the same
I can swap n_seats and n_voters like this branch, to make the number of voters both the input size and the throughput. A line graph will be plotted with the number of voters as the x-axis. That isn't useless, but I also want a plot with the number of seats on the x-axis, and there seems to be no way to do that while keeping the throughput report unchanged.
Conclusion
It's not such a big deal to lose throughput reporting, but I am curious how this issue can be resolved. Is there a philosophical need to tie throughput to input size? Aren't there many situations where the relevant throughput is independent of input size? Wouldn't it be better to split them into two settings?
This is more of a question and a feature request than a bug report. I understand if there are insufficient resources to split up a setting that's usually the same, but at least this would be documented and would hopefully save time for future users who get tripped up by it.