For NumPy compatibility, ak.sort's 'stable' (bool) should be 'kind' (str)

Open jpivarski opened this issue 4 years ago • 2 comments

The interface to ak.sort and ak.argsort should be more like np.sort and np.argsort.

  • The Awkward functions have an ascending (bool) argument that isn't really necessary, since one can just negate the input. This ascending argument is passed all the way down to the underlying kernel, so the code could be simplified by eliminating it. This is not super-important because we're allowed to have more features than NumPy and the code's already there. But if writing the GPU kernels is simplified by dropping the ascending argument, then let's do it.
  • The Awkward functions have a stable (bool) to choose between C++'s stable and C++'s unstable sorting algorithm. NumPy has traditionally named explicit algorithms in a kind (str) argument, but since 1.15, it seems to be moving in the direction of describing only features, like "stable" vs "unstable", rather than explicit algorithms. @ianna has been thinking about GPU implementations of these sorting algorithms, and it looks like quicksort would be the most doable, so what we offer maps nicely onto the names NumPy defines:
    • NumPy's "stable" → our C++ stable algorithm, available only on CPU
    • NumPy's "mergesort" is a synonym for "stable", but included only for backward compatibility → an informative error message, since our stable algorithm isn't exactly mergesort
    • NumPy's "heapsort" → our C++ unstable algorithm, available only on CPU, which really is heapsort
    • NumPy's "quicksort" → the new algorithm @ianna is planning to implement by hand on CPU and GPU, the only algorithm available on the GPU.
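The mapping proposed in the list above could be sketched as a small lookup table. This is only an illustration: `resolve_kind`, `_KIND_TABLE`, and the `backend` argument are hypothetical names, not part of Awkward Array's API, and the returned strings merely label the three algorithms described above.

```python
# Hypothetical sketch of the proposed kind -> algorithm mapping.
# None of these names exist in Awkward Array; they are for illustration only.
_KIND_TABLE = {
    # kind: (algorithm label, backends on which it would be available)
    "stable": ("stable", {"cpu"}),
    "heapsort": ("unstable", {"cpu"}),
    "quicksort": ("quicksort", {"cpu", "gpu"}),
}


def resolve_kind(kind="quicksort", backend="cpu"):
    """Return the algorithm label for a NumPy-style kind, or raise."""
    if kind == "mergesort":
        # The proposal asks for an informative error rather than a silent alias.
        raise ValueError(
            'kind="mergesort" is not supported because the stable algorithm '
            'is not exactly mergesort; use kind="stable" instead'
        )
    if kind not in _KIND_TABLE:
        raise ValueError(f"unrecognized kind: {kind!r}")
    algorithm, backends = _KIND_TABLE[kind]
    if backend not in backends:
        raise NotImplementedError(
            f"kind={kind!r} is not available on backend {backend!r}"
        )
    return algorithm
```

With this sketch, `resolve_kind("quicksort", "gpu")` succeeds, while `resolve_kind("stable", "gpu")` raises `NotImplementedError`, matching the "available only on CPU" notes in the list.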

At this stage, at least, I'll deprecate the stable (bool) option in favor of a kind (str), but leave ascending as it is.
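A deprecation shim for that transition might look roughly like the following. It is a sketch under assumptions: `normalize_sort_options` is a hypothetical helper, and mapping `stable=False` to `"heapsort"` is inferred from the list above (the existing unstable algorithm "really is heapsort"), not taken from any actual implementation.

```python
import warnings


def normalize_sort_options(kind=None, stable=None):
    """Hypothetical helper: fold the deprecated `stable` flag into `kind`."""
    if stable is not None:
        warnings.warn(
            "the 'stable' argument is deprecated; use 'kind' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        if kind is None:
            # Assumed mapping: True -> "stable", False -> "heapsort".
            kind = "stable" if stable else "heapsort"
        # If both are given, the explicit `kind` takes precedence here.
    if kind is None:
        # NumPy's default kind.
        kind = "quicksort"
    return kind
```

Callers that still pass `stable=True` would keep working but see a `DeprecationWarning`, while callers that pass nothing get NumPy's default of `"quicksort"`.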

Before doing this, I should create a formal roadmap with scheduled releases, since the deprecation date would not be the Feb 1 date, but the one after that, whatever that is (#616).

jpivarski avatar Dec 22 '20 14:12 jpivarski

@ianna, since the NumPy default is "quicksort", I can't do this until a quicksort exists.

But we now have a formal roadmap, so if quicksort is available in January or February, then we can set the deprecation for 1.2.0 (April, 2021).

jpivarski avatar Dec 22 '20 21:12 jpivarski

@jpivarski - incidentally NumPy quicksort has been changed to introsort: https://numpy.org/doc/stable/reference/generated/numpy.sort.html

It's the same as in AwkwardArray :-)

ianna avatar Jan 22 '21 11:01 ianna

@jpivarski I think we can close this: Awkward doesn't implement the same sorting routines as NumPy, but we do support kind. Now that we have dedicated NEP-18 entrypoints, we translate the kind to a stable vs non-stable sort.
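For readers unfamiliar with what "translate the kind to a stable vs non-stable sort" means, a minimal sketch follows. The function name `kind_to_stable` is hypothetical and this is not Awkward's actual code; it only assumes NumPy's documented `kind` values (`"quicksort"`, `"mergesort"`, `"heapsort"`, `"stable"`) and that NumPy treats `"mergesort"` as a backward-compatible synonym for a stable sort.

```python
def kind_to_stable(kind=None):
    """Hypothetical translation of a NumPy-style kind to a stable/unstable flag."""
    if kind is None or kind in ("quicksort", "heapsort"):
        # NumPy makes no stability guarantee for these kinds.
        return False
    if kind in ("stable", "mergesort"):
        # Both names request a stable sort in NumPy.
        return True
    raise ValueError(f"unrecognized kind: {kind!r}")
```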

agoose77 avatar Nov 08 '23 11:11 agoose77