openzeppelin-contracts
Add Array.sort
Fixes #3490
Add sorting function for memory arrays
PR Checklist
- [x] Tests
- [x] Documentation
- [x] Changelog entry
The second commit cuts gas by roughly a third by switching memory access from Solidity to Yul; there's no way around the index-checking overhead in pure Solidity.
The memory location is calculated in Yul using the formula `add(32, add(array, shl(5, idx)))`. It seems to be the sweet spot when it comes to gas. What I don't understand is why even slight changes to it cause a sharp gas increase, in the ballpark of 10% for the whole sorting algorithm. A formula like `add(array, add(32, shl(5, idx)))` should yield almost identical opcodes, but it somehow manages to be MUCH more expensive. Does anybody know why? I've tried compiling both snippets in isolation and looking at the opcodes, but they look similarly efficient.
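For context, a minimal sketch of the unchecked read described above (the library and helper names are illustrative, not this PR's API):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library UncheckedArray {
    /// Reads `array[idx]` without the compiler's bounds check.
    /// Memory layout: the word at `array` holds the length; elements follow,
    /// so element `idx` lives at `array + 32 + 32 * idx`.
    function unsafeGet(uint256[] memory array, uint256 idx) internal pure returns (uint256 value) {
        assembly {
            // add(32, add(array, shl(5, idx))) == array + 32 + idx * 32
            value := mload(add(32, add(array, shl(5, idx))))
        }
    }
}
```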
Indeed, I was considering adding a function that skips the length check when reading memory (or storage) arrays.
For the gas optimization part, I believe the arguments should be consumed as early as possible to avoid reordering the stack. This means using them in the most nested operation (the one that is performed first).
Have you tried `add(32, add(shl(5, idx), array))`, and if yes, was it better/worse?
> Have you tried `add(32, add(shl(5, idx), array))`, and if yes, was it better/worse?
I just did; the results are marginally worse, around 2.5% for an entire sort of 100 elements. The shallowness on the stack would make a lot of sense, especially when inlined.
Using unchecked arithmetic and array accesses on @Amxx's quicksort implementation seems to be more efficient than the heapsort implementation in this PR, at least in the array I'm testing with here. More testing and benchmarking is needed to conclude anything. We should look at the specific case of sorting an already sorted array, which is known to be bad for quicksort with the pivot selection we're using (always the leftmost item). We could consider selecting the middle element in the array to mitigate for that case.
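For illustration, a generic leftmost-pivot quicksort sketch with unchecked arithmetic (not @Amxx's exact implementation; Solidity's bounds-checked accesses are kept here for brevity, though an assembly read like the one above could remove them):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library QuickSortSketch {
    // Leftmost-pivot quicksort; degrades to O(n^2) on already sorted input.
    function sort(uint256[] memory arr) internal pure {
        if (arr.length > 1) _quickSort(arr, 0, arr.length - 1);
    }

    function _quickSort(uint256[] memory arr, uint256 lo, uint256 hi) private pure {
        unchecked {
            if (lo >= hi) return;
            uint256 pivot = arr[lo];
            uint256 i = lo;
            for (uint256 j = lo + 1; j <= hi; ++j) {
                if (arr[j] < pivot) {
                    ++i;
                    (arr[i], arr[j]) = (arr[j], arr[i]);
                }
            }
            // Place the pivot at its final position.
            (arr[lo], arr[i]) = (arr[i], arr[lo]);
            if (i > lo) _quickSort(arr, lo, i - 1); // guard against uint underflow
            _quickSort(arr, i + 1, hi);
        }
    }
}
```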
After a deep dive into sorting algorithms and the implementations in https://g.solidity.cc/challenges/sort, I've started to think that our goal shouldn't be to maximize efficiency. It's hard to optimize for all cases, and in any case sorting in a smart contract should be discouraged. Instead, I'm thinking we should aim for simple, auditable code that is reasonably efficient. In this regard, quicksort is quite attractive because of its simplicity.
@frangio Thank you for benchmarking it! What are the exact results?
Auditability is a good reason to choose quicksort. OTOH heapsort has a better worst-case complexity of O(n log n), while quicksort has O(n^2). No matter how we tune it, an attacker will probably be able to arrange the worst-case input, which may lead to attacks and gas-usage surprises. Heapsort is safer in this regard and more predictable, which are nice properties if we're looking for a general-purpose algorithm that doesn't require diving deep into the problem.
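For reference, a minimal generic heapsort sketch (not necessarily this PR's exact code) showing where the O(n log n) bound comes from: `_siftDown` does at most log n swaps per call, and it is called O(n) times.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library HeapSortSketch {
    // In-place, non-recursive, O(n log n) comparisons even in the worst case.
    function sort(uint256[] memory arr) internal pure {
        uint256 n = arr.length;
        if (n < 2) return;
        unchecked {
            // Build a max-heap from the bottom up.
            for (uint256 i = n / 2; i > 0; --i) {
                _siftDown(arr, i - 1, n);
            }
            // Pop the maximum to the end and shrink the heap.
            for (uint256 end = n - 1; end > 0; --end) {
                (arr[0], arr[end]) = (arr[end], arr[0]);
                _siftDown(arr, 0, end);
            }
        }
    }

    function _siftDown(uint256[] memory arr, uint256 root, uint256 n) private pure {
        unchecked {
            while (true) {
                uint256 child = 2 * root + 1;
                if (child >= n) break;
                // Pick the larger of the two children.
                if (child + 1 < n && arr[child + 1] > arr[child]) ++child;
                if (arr[root] >= arr[child]) break;
                (arr[root], arr[child]) = (arr[child], arr[root]);
                root = child;
            }
        }
    }
}
```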
Away from the computer right now so I can't share the exact results.
Definitely agree that the better worst-case complexity of heapsort is a factor to consider. That said, I still wouldn't be comfortable with O(n log n) either; it could still lead to vulnerabilities if n can be controlled by an attacker. On-chain sorting should be used sparingly and with arrays known to be small, or in a path where the impact of a large array is contained.
What is needed to proceed here is to gain some more confidence that Quicksort would generally be more efficient, by running some more benchmarks:
- An already sorted array, or one that is nearly sorted
- Arrays of different sizes (what is the largest that can be sorted under the gas limit?)
Note that my gas-bench repo benchmarks the sorting of an array that is defined in Solidity; this isn't very usable for running multiple tests with different arrays, so it would really help to change it to load the array from calldata.
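A minimal sketch of what that could look like (the contract and function names here are hypothetical):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical harness: the array under test comes in via calldata, so one
// deployment can be benchmarked against arbitrary inputs.
contract SortBench {
    function benchSort(uint256[] calldata input) external pure returns (uint256[] memory out) {
        out = input; // copies the calldata array into memory
        // sort(out); // whichever implementation is being measured
    }
}
```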
We also need to try different methods for selecting the pivot. We're currently using the first item in the array, which will be bad for sorted or nearly sorted arrays. We can try using the item at the middle of the array. There are better methods, such as the median of three, where you take the median of the first, middle, and last items; this should be tried, but it may add too much overhead for small arrays, which are our main focus here. (We could also try a median of two?)
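An illustrative median-of-three selection (names hypothetical):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library PivotSketch {
    // Returns the index of the median of the first, middle, and last
    // elements of arr[lo..hi].
    function medianOfThree(uint256[] memory arr, uint256 lo, uint256 hi) internal pure returns (uint256) {
        unchecked {
            uint256 mid = lo + (hi - lo) / 2;
            (uint256 a, uint256 b, uint256 c) = (arr[lo], arr[mid], arr[hi]);
            if (a < b) {
                if (b < c) return mid;  // a < b < c
                return a < c ? hi : lo; // b is the max; median is max(a, c)
            }
            if (a < c) return lo;       // b <= a < c
            return b < c ? hi : mid;    // a is the max; median is max(b, c)
        }
    }
}
```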
There is also the possibility of considering other partitioning algorithms; I've seen discussion about Hoare vs Lomuto partitioning that we should look into. Hoare's scheme is said to be more efficient than Lomuto's (it does fewer swaps), and I don't think we're using it. I don't know if the difference is relevant under the specifics of the EVM, but we should try.
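A sketch of Hoare's scheme, assuming a middle-element pivot (names hypothetical). Note that with Hoare partitioning the recursion is on (lo, p) and (p+1, hi), not (lo, p-1) and (p+1, hi), which also conveniently avoids a uint underflow at p = 0:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

library HoareSketch {
    // Partitions arr[lo..hi] around a middle-element pivot. Returns p such
    // that every element of arr[lo..p] is <= every element of arr[p+1..hi]
    // (the pivot itself does not necessarily end up at p).
    function partition(uint256[] memory arr, uint256 lo, uint256 hi) internal pure returns (uint256) {
        unchecked {
            uint256 pivot = arr[lo + (hi - lo) / 2];
            (uint256 i, uint256 j) = (lo, hi);
            while (true) {
                while (arr[i] < pivot) ++i; // scan right past smaller items
                while (arr[j] > pivot) --j; // scan left past larger items
                if (i >= j) return j;
                (arr[i], arr[j]) = (arr[j], arr[i]);
                ++i;
                --j;
            }
        }
    }
}
```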
Proposals for other algorithms are also welcome, considering auditability as a main factor to optimize, with reasonable efficiency that isn't too far behind the most optimized implementations.
> so it would really help to change it to load the array from calldata
@frangio Would you be open to PRs for your gas-bench repo? Something along the lines of setting up unit tests to run the benchmarking with various inputs (and having the function load from calldata, as you suggest above).
@burke-md Yes that sounds good!
This is based on an old commit. Replaced by #4846.