Reduce allocations during encoding
I didn't know whether to expect #10 to be accepted, but since it was, here are some similar optimizations (with more detailed explanations).
## Summary

Using the iterator extend pattern can drastically reduce the number of allocations (as well as the total memory allocated!) when generating the hash for an image (in the `encode` function).
## Status Quo

Previously, `encode_base83_string` allocated a new `String` every time it was called. This incurred a heap allocation at least 3 times, despite the string only being between 1 and 4 characters long. Additionally, another heap allocation would be incurred for every encoded AC component.
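For illustration, a minimal sketch of that shape (the function name mirrors the crate's `encode_base83_string`, but the body here is hypothetical, not the crate's actual code):

```rust
// Blurhash's base83 alphabet (83 ASCII characters).
const CHARSET: &[u8] =
    b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~";

// Old shape: every call builds and returns a fresh `String`,
// so each encoded component costs its own heap allocation.
fn encode_base83_string(value: u32, length: u32) -> String {
    let mut result = String::new();
    for i in 1..=length {
        let digit = (value / 83u32.pow(length - i)) % 83;
        result.push(CHARSET[digit as usize] as char);
    }
    result
}

fn main() {
    let mut hash = String::new();
    // One allocation inside each call, plus growth of `hash` itself.
    hash += &encode_base83_string(6, 1);
    hash += &encode_base83_string(19, 4);
    assert_eq!(hash, "6000J");
}
```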
## Optimizations

- Preallocating the hash `String` so that it never has to be resized. Visually counting up the number of characters to be placed in the string yields this equation for the hash string size: `1 + 1 + 4 + 2 * factors`. Specifically, 1 for the size flag, 1 for the quantized maximum value (or a zero), 4 for the DC component, and 2 times the number of factors (i.e. `2 * ac.len()`).
- Extending the string with a `char` iterator rather than appending a `String`. Changing the return type of the internal function `encode_base83_string` to the opaque type `impl Iterator<Item = char>` means that the characters of the hash string can be lazily generated without an intermediate heap allocation for each `String`. The iterator is then consumed by the `String::extend` method, which takes into account the size hint of the iterator; since the size hint will be exact (check `RangeFull::size_hint` and `Map::size_hint`), the method will always know the exact size to reserve during the extension. However, since we already preallocated, we will never have to reserve anything. This means that `extend` will simply perform a safety size check (just an integer comparison) and then a handful of `push` calls (usually just 1 or 2 anyway).
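Both optimizations together can be sketched like this (the names and the `ac_len` of 5 are illustrative assumptions, not the crate's exact code):

```rust
// Blurhash's base83 alphabet (83 ASCII characters).
const CHARSET: &[u8] =
    b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~";

// New shape: a lazy iterator over the digits; no `String` is allocated here.
// The range and map adapters report an exact size hint.
fn encode_base83(value: u32, length: u32) -> impl Iterator<Item = char> {
    (1..=length).map(move |i| {
        let digit = (value / 83u32.pow(length - i)) % 83;
        CHARSET[digit as usize] as char
    })
}

fn main() {
    let ac_len = 5; // hypothetical number of AC components
    // Preallocate: 1 (size flag) + 1 (quantized max) + 4 (DC) + 2 per AC component.
    let mut hash = String::with_capacity(1 + 1 + 4 + 2 * ac_len);
    // `extend` sees an exact size hint, finds capacity already reserved,
    // and reduces to a capacity check plus `push` calls.
    hash.extend(encode_base83(6, 1));
    hash.extend(encode_base83(19, 4));
    // ...extend with the remaining components the same way.
    assert_eq!(hash, "6000J");
}
```

Note the sketch uses `RangeInclusive`, whose `size_hint` is also exact, so `String::extend` still reserves nothing beyond the preallocation.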
## Implications and Benefits

All in all, we reduce the number of allocations from O(n) to O(1) (where n is the number of AC components). This is beneficial because the WASM memory model punishes frequent, small allocations. Additionally, allocating less memory is beneficial under `wee_alloc`, whose allocation is an O(n) operation. Moreover, `wee_alloc` has a two-word overhead for each allocation, meaning we now pay a constant four words of overhead rather than an unbounded amount.