Micro benchmarks
Use the Google Benchmark library [1] to write micro-benchmarks for the vector math routines.
Each benchmark should invoke the function with random values drawn from its input range.
[1] https://github.com/google/benchmark
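To make the idea concrete, here is a minimal sketch of what one such benchmark could look like. It assumes the scalar entry point Sleef_sin_u10 from sleef.h; the buffer size, seed, and input range are arbitrary choices for illustration, and a real suite would cover the vector variants as well.

```c++
#include <benchmark/benchmark.h>
#include <sleef.h>

#include <random>
#include <vector>

// Micro-benchmark for the scalar sine with a 1.0-ulp accuracy bound.
// The random inputs are generated once, outside the timed loop, so that
// only the function call itself is measured.
static void BM_Sleef_sin_u10(benchmark::State &state) {
  std::mt19937_64 rng(12345);
  std::uniform_real_distribution<double> dist(-10.0, 10.0);

  std::vector<double> in(4096);
  for (double &x : in) x = dist(rng);

  size_t i = 0;
  for (auto _ : state) {
    // DoNotOptimize keeps the compiler from eliding the call.
    benchmark::DoNotOptimize(Sleef_sin_u10(in[i]));
    i = (i + 1) % in.size();
  }
}
BENCHMARK(BM_Sleef_sin_u10);

BENCHMARK_MAIN();
```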
I am now considering overhauling the benchmarking tools. My plan is to automate graphing with gnuplot first, and then introduce Google's framework. For processing the data, I am planning to use Java, since I am used to it.
I think we should do this after the transition from Makefile to CMake, as all the components will then be easier to integrate.
I think it would be better to implement this in the following order:
- write the micro benchmarks with the Google Benchmark framework;
- produce the graphical visualization.
I am happy for you to use Java+gnuplot, but I think it would be better to use Python+matplotlib, as it might be easier to maintain and run. It would require less configuration than setting up a Java VM, and I think it would be easier to deploy on Travis.
Do we need to deploy it on Travis? You know, the benchmark results cannot be trusted, since Travis runs in the cloud. The problem with Python is that I would need time to learn it. I have lots of experience with Java and some experience with gnuplot, so those would be much easier for me.
Then Java + gnuplot it is. You are right, we cannot run benchmarks in the cloud.
I still think you should first use Google's micro-benchmark framework and then plot the data, because the output of Google Benchmark might differ from what your plotting scripts expect, and you might need to rework the scripts if that turns out to be the case. Also, Google Benchmark has some reporting facilities (see [1] for an example), which might be used instead of a plotting tool.
I recommend leaving the plotting scripts as the last (optional) part of this work.
[1] http://1.bp.blogspot.com/-wk7hsdYodo8/UtS75FZag6I/AAAAAAAAAks/cQFiCXPbtwk/s1600/image00.png
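For reference, the framework can already emit machine-readable reports without any extra scripting, using the flags documented in the Google Benchmark README, e.g. `./sleef_bench --benchmark_repetitions=10 --benchmark_out=results.json --benchmark_out_format=json` (where `sleef_bench` is a hypothetical name for our benchmark binary). The plotting scripts could then consume the JSON output directly.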
Could you explain a little about how you are planning to use the output data? I have always thought of this work as part of writing our paper, and from that perspective, drawing graphs is essential.
Getting the numbers is very important, and using the Google framework should guarantee that we have reliable measurements. If the aim of setting this up is only to get the numbers for the paper, I don't think we need to store any script for that here.
My goal here was to make sure that the numbers we were getting were reliable, and I believe that those obtained with the Google micro-benchmark framework have that property. They would be reliable not only for us, but also for the people reading them, as they are produced via a standard tool.
Also, using Google micro-benchmarks would make it easier for other people to verify our claims on their own machines.
For the paper, maybe we could keep the scripts that generate the figures together with the paper itself?
I still don't understand why there could be so much difference in the reliability of the measured values. We are just measuring the execution time of small pieces of C code, which basically have no conditional branches or memory accesses. Execution time is highly reproducible. If this were Java, there would be many things to consider, like JIT compilation or garbage collection. How much reliability do you need?