cpp-sort icon indicating copy to clipboard operation
cpp-sort copied to clipboard

More comprehensive benchmarks

Open Morwenn opened this issue 7 years ago • 1 comments

The benchmarks page on the wiki is seldom updated, and only compares the sorting of integer and floating point values. We need better and more comprehensive benchmarks, with the abillity to generate distributions for more kind of types (pairs, arrays, strings, etc...) so that we can better compare the different algorithms and where they are better than others.

A first easier step would be to group sorting algorithms by families to make some benchmarks more relevant since the different families of algorithms generally don't have the same goals.

Some other interesting results could be highlighted: for example, is it stably sorting by applying stable_adapter to pdq_sorter faster in many cases than using an inherently stable sorting algorithm? When is indirect sorting faster? When is using a Schwartz transform faster? I think that some pieces of the library can answer many use cases, but we need many more benchmarks to highlight which tool might be the best for a specific scenario. With a comprehensive benchmark suite, the dedicated wiki page could even be turned in a whole section with several sub-articles.

Another problem: I remember an oldish benchmark that showed how sorting strings with the same algorithm could have greatly different results depending on which compiler was used, which might make the benchmarks more complicated once again...

Ideally we would need a tool where we can select the different parameters before running the benchmarks:

  • Algorithm
  • Adapter if any
  • Type of data
  • Distribution of data
  • Compiler
  • Compiler options

Unfortunately that would require a full tool, and probably a dedicated website too. While this may be of value, I most probably don't have the time nor the skills required to do that. The benchmarks on C++ Hut are a good example of what I would like to achieve.

Morwenn avatar Mar 02 '18 14:03 Morwenn

We are still far from comprehensive benchmarks, but there have been some improvements recently:

  • A new benchmark for stable sorts that don't allocate heap memory was added to the wiki as part of issue #175.
  • A new dist::to_long_string projection was added to the benchmark suite by 6b5862c3e50816ba7f8a752cb3d17c2853fc862b. It makes it possible to reuse the existing distributions to generate collections of long strings with a long common prefix. This is a good start to show expensive comparisons.

Morwenn avatar Apr 07 '21 17:04 Morwenn