wren icon indicating copy to clipboard operation
wren copied to clipboard

Change list.sort to use Hoare partitioning

Open PureFox48 opened this issue 1 year ago • 11 comments

#1141 identified two problems with the existing implementation, namely that is was very slow when presented with lists which were already sorted or all the same.

The purpose of this PR is to fix these problems by switching from Lomuto to Hoare partitioning. As far as 'random' lists are concerned, the figures in the above issue indicate that there will be no noticeable change in performance for small lists and possibly a small improvement for large lists.

This change will be invisible to users as only two private methods are affected. No change is required to either the documentation or tests.

PureFox48 avatar Mar 03 '23 10:03 PureFox48

How do I change wren_core.wren.inc to match these changes? It says not to edit it as it will be generated automatically but that doesn't seem to have happened here.

PureFox48 avatar Mar 03 '23 10:03 PureFox48

util/wren_to_c_string.py src/vm/wren_core.wren.inc src/vm/wren_core.wren

But wait I have some minor style code to submit, before you rebase your change.

mhermier avatar Mar 03 '23 10:03 mhermier

Ah, so there's a Python utility to do this which we need to run manually. OK, thanks I'll do that later after we've agreed any changes.

I've based the implementation here on the Wikipedia pseudo-code which is said to be taken from the 'Introduction to Algorithms' book by T.H.Cormen and others.

As it uses do/while, which we don't have in Wren, to ensure certains lines are run at least once, I've used an infinite loop with a reversed logic condition at the end to simulate that.

Also, don't forget that we can't change the comparator function here from a simple true/false to three-state as it will break existing code which uses a custom comparator.

It's not necessary to do so in any case as quicksort is an unstable algorithm and so the order of elements comparing equal is not guaranteed to be maintained.

PureFox48 avatar Mar 03 '23 12:03 PureFox48

@mhermier

Any suggested changes here?

If not, I'll regenerate wren_core.wren.inc and add it to the PR.

PureFox48 avatar Mar 07 '23 10:03 PureFox48

Well there the minor code changes, in the reviews.

Else there is the proposition in the reviews, that could maybe simplify code a little. Thought I didn't checked if it was safe.

mhermier avatar Mar 07 '23 20:03 mhermier

What reviews are those?

The only comment you made about the code in https://github.com/wren-lang/wren/issues/1141 was:

It should optimize a little bit more, because functional is a little bit sub-optimal in wren).

I couldn't see how it could be optimized myself because we have no choice but to use a comparator function.

PureFox48 avatar Mar 07 '23 23:03 PureFox48

Look at the pull request log. There are 3 reviews opened. 2 about minor code style, with missing space around binary operators. And 1 about a possible improvement, where I wonder if adding an early check, could avoid to have 3 condition, instead of 1. Last one is an open question.

mhermier avatar Mar 08 '23 07:03 mhermier

Well there must be something wrong somewhere as github is telling me there are No reviews for this PR.

Are you sure you've posted them and they're not just in draft?

PureFox48 avatar Mar 08 '23 09:03 PureFox48

OK, thanks, I've incorporated all three of your points into the PR.

Everything seems to be working fine including the degenerate cases when the list is empty or has only one element.

PureFox48 avatar Mar 08 '23 19:03 PureFox48

Seems good enough. If it was an external library, I would have changed the algorithm to some hybrid, mode efficient with smaller lists, and conserve quick-sort for the large lists .

mhermier avatar Mar 08 '23 20:03 mhermier

Well, back in the day, sorting implementations would typically use insertion sort for small lists (less than 10 elements say) and quicksort for larger lists. We could still do that here (inserton sort is only about 12 lines) but with modern hardware you'd only notice the difference if, say, you were sorting a large number of small lists in a loop. Frankly, I doubt whether it's worth it for core.

PureFox48 avatar Mar 08 '23 22:03 PureFox48