fastutil icon indicating copy to clipboard operation
fastutil copied to clipboard

Implement counting sort for boolean and byte arrays

Open techsy730 opened this issue 5 years ago • 5 comments

boolean and byte types have a small enough keyspace that we can reasonably always be able to quickly allocate an array for all possible elements. (or for boolean, two ints on the stack) And you can't beat O(n) for sorting.

Maybe it can be provided for short too, with a warning in Javadoc that this may allocate a lot more memory then the size of the array for small arrays.

techsy730 avatar Jul 17 '19 23:07 techsy730

You are absolutely right. Did you do any kind of benchmark?

vigna avatar Jul 19 '19 09:07 vigna

@vigna After glancing at this issue, I was inspired and devised a method of sorting a boolean[] in O(n) time with O(1) space. I'd be happy to open a pull request to replace what exists (after performing benchmarks with JMH), or to add the following methods:

public static void sort(boolean[] booleans) {
    sort(booleans, 0, booleans.length - 1);
}

public static void sort(boolean[] booleans, int fromIndex, int toIndex) {
    // Add appropriate range check before attempting to sort
    for (int head = fromIndex, tail = toIndex; tail > head; tail--) {
        if (booleans[tail]) {
            continue;
        }

        // Move 'head' to first 'true' element in the array
        while (!booleans[head]) {
            if (++head == tail) {
                // The array is now sorted
                return;
            }
        }

        // Swap elements at 'head' and 'tail'
        booleans[head] = false;
        booleans[tail] = true;
    }
}

jhg023 avatar Nov 03 '19 18:11 jhg023

OMG I completely lost track of that. Yes, that would be nice as a replacement for sort(). I don't understand the continue—it seems a NOP.

vigna avatar Dec 16 '20 17:12 vigna

@vigna I had to re-familiarize myself with what I had written since it's been so long, but it seems that I only perform a swap when booleans[tail] is false.

This makes sense, as all false elements should be placed at the beginning of the array, and all true elements should be placed at the end.

If we attempt to swap when booleans[tail] is true, then it would result in an unsorted array.

jhg023 avatar Dec 16 '20 17:12 jhg023

Ahhhh I missed the { at the and of the for. I thought the if was the only instruction 🤦🏻‍♂️.

vigna avatar Dec 16 '20 17:12 vigna