secp256k1 icon indicating copy to clipboard operation
secp256k1 copied to clipboard

Add secp256k1_pubkey_sort

Open jonasnick opened this issue 2 months ago • 8 comments

This PR adds a secp256k1_pubkey_sort function the the public API which was originally part of the musig PR (#1479). However, I opened a separate PR because it adds internal functions that are also used by the WIP silent payments module.

jonasnick avatar Apr 16 '24 19:04 jonasnick

Concept ACK. This is useful independently of MuSig and Silent Payments. Also for users with no standard library, e.g., on hardware wallets.

cc @roconnor-blockstream who wrote most of this code


Just out of curiosity, I checked what glibc actually does. They have Quicksort, Insertionsort, Mergesort, and Heapsort in their code base. If malloc fails, they'll always fall back to Heapsort. I think they had been using Quicksort and Mergesort for decades (perhaps selected depending on the platform?), but they changed their mind quite often lately, and this was indeed related to degenerate and adversarially constructed inputs:

  • Oct 2023: They added Heapsort and used it to implement Introsort. Introsort is basically Quicksort two fallbacks: 1) a fallback on some O(n log n) sorting algorithm in degenerate cases, and 2) a fallback on Insertion sort for small arrays (https://github.com/bminor/glibc/commit/274a46c9b25ab733a1fb9fb1497f1beecae30193). They had also removed Mergesort, making Introsort the algorithm used everywhere. (https://github.com/bminor/glibc/commit/03bf8357e8291857a435afcc3048e0b697b6cc04)
  • Nov 2023: The logic for the fallback to Heapsort was tightened, and a test with a degerate array. (https://github.com/bminor/glibc/commit/64e4acf24da15c11cb83f933947df3b2e8a700cd: "To avoid a denial-of-service condition with adversarial input, it is necessary to fall over to heapsort if tail-recursing deeply, too [...]")
  • Jan 2024: They reverted to Mergesort because it turns out that people rely on the sorting being stable, even though this is not guaranteed by the standard, and even though their Heapsort fallback won't guarantee it anyway. (https://github.com/bminor/glibc/commit/709fbd3ec3595f2d1076b4fec09a739327459288)

This means that the current libc is probably fine. But old versions may be problematic, and other implementations of the standard library may also be problematic. If anything, all of this confirms that Heapsort is a good choice, and that we should not rely on the standard library here.

real-or-random avatar Apr 16 '24 23:04 real-or-random

A wild idea: What if we used something simple(like shellsort or even bubblesort) that is very easy to verify, and has good performance in the best case (when it's already sorted) and then we recommend users that use huge list of pub keys to sort them before passing them in with the stdlib sort function of the platform their using.

Needing to reimplement and maintaining highly efficient in place sorting every time is very sad

About it being useful, I'll say that embedded rust still gives you sorting via libcore, and I think every embedded C toolkit has some sort of sorting utility, but I might be wrong on this

elichai avatar Apr 18 '24 15:04 elichai

A wild idea: What if we used something simple(like shellsort or even bubblesort) that is very easy to verify, and has good performance in the best case (when it's already sorted) and then we recommend users that use huge list of pub keys to sort them before passing them in with the stdlib sort function of the platform their using.

Hm, I think this gives us the worst of both worlds in the end:

  • We have to maintain the implementation of sorting algorithm.
  • We still tell users to rely on the stdlib.

Also, I doubt that Shellsort or Bubblesort are much easier to implement than Heapsort. (And look at the nature of the bug I found: The swap was wrong, not the Heapsort itself. And you'll need a swap for every algorithm.)

Needing to reimplement and maintaining highly efficient in place sorting every time is very sad

Yeah, but that's because C is sad (sometimes).

real-or-random avatar Apr 19 '24 08:04 real-or-random

I'll echo @real-or-random 's point and add:

and then we recommend users that use huge list of pub keys to sort them before passing them in with the stdlib sort function of the platform their using.

I think this is fine if the sorting step is a convenience, but in the case of the silent payments module sorting is a required step for generating the outputs correctly. For this, I think it's far better to have sorting handled inside the module vs requiring the user to sort before passing the public keys. Doing the sorting for the users eliminates a difficult to catch error and we'd need to sort anyways to be able to detect and alert the user that they did not sort (or sorted incorrectly).

josibake avatar Apr 19 '24 12:04 josibake

I added a description of the interface to the hsort doc. Apparently mac OS uses a different order of arguments.

jonasnick avatar Apr 19 '24 18:04 jonasnick

Doing the sorting for the users eliminates a difficult to catch error and we'd need to sort anyways to be able to detect and alert the user that they did not sort (or sorted incorrectly).

You can test that a list is sorted in O(n) time, but sorting (by comparisons) take O(n log n) time

That said, I do agree that we should just sort ourselves, in O(n log n) worst case time, with constant memory overhead.

roconnor-blockstream avatar Apr 19 '24 20:04 roconnor-blockstream

Rebased on master and added change log entry.

jonasnick avatar Apr 22 '24 15:04 jonasnick

ACK 7d2591ce12d8a9b85f210cf9d678e91cee125ee9

sipa avatar May 03 '24 15:05 sipa