cpp-sort icon indicating copy to clipboard operation
cpp-sort copied to clipboard

Parallel sorting algorithms

Open Morwenn opened this issue 10 years ago • 3 comments

At the time of writing, all of the sorting algorithms provided by cpp-sort are sequential. Having parallel sorting algorithms too would be great, but how do we want to implement them? No definitive design but several ideas:

  • The simplest way would be to make a parallel sorter "just another sorter". The documentation would tell that it can work in parallel but with no associated semantics from the library.
  • The sorters could instead take execution policy parameters as the ones proposed in the Parallelism TS. Moreover, it may allow them to benefit from executors in the future if this design makes it into the standard library.
  • While it may introduce yet another layer of complexity and maybe some additional SFINAE, the proposed trait std::is_execution_policy should ease the implementation a bit: for example cppsort::sort could simply use tag dispatch to forward the parameter appropriately and do the current checks and layer deeper. Hope it can work well with hybrid_adapter too.
  • Some sorters such as merge_sorter or std_sorter would probably keep the same name and provide both sequential and parallel versions of the algorithm. Other sorters will probably be parallel-only and implement some specific sorting algorithms.
  • sorter_facade could probably make some default overloads when no execution policy is provided or when std::vec is passed. Not sure in which sense.
  • On the other hand, some sorters being explicitly parallel, not being able to use them without an explicit execution policy can be a bit bothersome. But, executor design and stuff... I don't think a user-provided default execution policy is viable.

Annnnnnd, once again, many design questions that won't have an answer right now. Anyway, these algorithms probably won't see the light before the parallelism TS is widely supported. The work will probably start in an experimental branch and won't be merged before long.


New thoughts for a user-friendly yet flexible design: make sorters take execution policy and allow the to call different but related overloads depending on the policy (for example merge_sorter has obvious sequencial and parallel versions). However, do not default the execution policy to std::sequential_execution_policy: instead add a default_policy trait to sorters and make sorter_facade call the overload for that default execution policy when none is provided; it should still be std::sequential_execution_policy for most sorters, but some sorters could implement algorithms that are parallel by nature and don't have a sequencial counterpart, making std::parallel_execution_policy the obvious default for them. When a sorter does not provide a default execution policy, make sorter_facade call an operator() overload that does not take an execution policy if such an overload exists, and trigger an error otherwise (soft or hard, TBD).

Such a mechanism should ensure that the sorters mostly do what we expect them to do while still remaining flexible. Moreover, the last point should also allow to use old sorters that are not meant to work with execution policies without having to modify them.

A small problem: sorter_facade won't be able to able to take advantage of a specialized sorter_traits to find the implementation since it is often specialized for the sorter and not the sorter implementation. One woud have to specialize it for the sorter implementation instead. It could probably be done (I'll have to analyze it first) but then it would require to change the tutorials to advise to specialize sorter_traits for the sorter implementation and not for the sorter itself. Then, it should be possible to use sorter_traits in sorter_facade to find the default execution policy.

A sorter implementation would then look like that:

struct spam_sorter_impl
{
    template<typename Iterator>
    auto operator()(std::sequential_execution_policy,
                    Iterator first, Iterator last) const
        { /* ... */ }

    template<typename Iterator>
    auto operator()(std::parallel_execution_policy,
                    Iterator first, Iterator last) const
        { /* ... */ }

    using default_policy = std::sequential_execution_policy;
};

The proposed executor_traits uses execution_category, so default_policy could be named default_execution_category instead to follow the same naming scheme. I doubt the name would be used that often, so we might as well have consistent names even if they are a bit long.

How to handle the type-erasing container std::execution_policy? Every implementation I have seen implements algorithms as virtual members of an execution policy so that polymorphism works, but since we do not implement the execution policies, it seems hard to extend. Open methods could be a solution, but if they ever get standardized, it will likely not be before years and years. Update: apparently std::execution_policy was not voted into C++17, which means that we won't have to handle the problem before a while.

Morwenn avatar Nov 09 '15 18:11 Morwenn

For the HPX library (https://github.com/STEllAR-GROUP/hpx) we're looking for help implementing the parallel sort algorithms for our implementation of the C++17 parallel algorithm specification (which is almost complete, sans sorting, see https://github.com/STEllAR-GROUP/hpx/issues/1141). Would you be interested in lending your hand and applying your experience?

hkaiser avatar Jan 11 '18 14:01 hkaiser

@hkaiser Hi, unfortunately I don't have any experience implementing parallel sorting algorithms; even the sequential ones I took from somewhere else for the most part. On the other, I can give you a list a repositories with supposedly high-quality parallel sorting algorithm implmentations where people might actually be able to help you:

  • Boost.Sort has three parallel, some stable, some unstable: indirect_block_sort, sample_sort and parallel_stable_sort. I don't know whether they're widely used, but the guy who wrote them also wrote spreadsort, which is a fairly good algorithm, so I'd be quite confident in the quality.
  • In-place Parallel Super Scalar Samplesort (IPS⁴o) claims to be one of the fastest sorting algorithms around. The sequential version can already outperform pdqsort, and the parallel version (the repo has both) should be even faster.
  • If you're looking for SIMD sorts which be be useful for floating point values, floki implements an Access Aligned Sort which seems fairly efficient. That said it's not really suitable for an std::sort-like function since it doesn't take a comparison function.

I hope that helps a bit if you need to find ideas or people to contribute (all of those projects are under permissive licenses). I may have written a sorting algorithms library, my work was mainly to aggregate existing sorting algorithm implementations and to burry them under template layers to provide a smooth (albeit overkill) interface, so I wouldn't be of much help for actually implementing sorting algorithms, sorry. Hope you will find someone to contribute :)

Morwenn avatar Jan 11 '18 15:01 Morwenn

@Morwenn thanks for the links! Much appreciated!

hkaiser avatar Jan 11 '18 15:01 hkaiser