blurhash
blurhash copied to clipboard
Optimize the Kotlin implementation to double performance
BlurHashDecoder optimizations
- Precompute cosines tables before composing the image so each cosine value is only computed once.
- Cosines are stored as Float instead of stored as Double and transformed to Float after multiplication.
- Compute cosines tables once if both are identical (for square images with the same number of colors in both dimensions).
- Store colors in a one-dimension array instead of a two-dimension array to reduce memory allocations.
- Use a simple
String.indexOf()
to find the index of a Base83 char, which is both faster and needs less memory than aHashMap
thanks to better locality and no boxing of chars. - Replace
value.pow(2f)
with fastervalue * value
- No cache is used, so computations may be performed in parallel on background threads without the need for synchronization which limits throughput.
Problems with the old cache
The old implementation used a cache that had 3 issues that evaded the tests:
- Access to the cache was not synchronized, thus two concurrent calls on different threads could result in cache corruption or reading incorrect values from a half-filled array. Since Blurhash is an expensive computation, the main use case is to call it from a pooled background thread.
- The cache key was the image dimension (width or height) multiplied by the number of colors in that dimension. This means the cache will hit the same result for an image width of 32 pixels with 8 colors (256) or 64 pixels with 4 colors (also 256) but this is incorrect because even if the array size for these configurations is the same and contains the same values, these values appear in a different order. The cache will thus return incorrect values for one of the two configurations. The correct cache key to avoid collisions would have been something like
(width * 10) + [number of colors]
, where 10 is the max colors number. - Even when the cache was disabled (
useCache
is false), the code still allocated memory but didn't write to it. This had two bad effects:- Memory was wasted.
- Worse: when the blurhash method was called two times in a row for a compatible configuration, first with the cache disabled then with the cache enabled, the second call will hit the cache but read empty values, resulting in an incorrect computation.
Fixing and keeping the cosines cache would not improve the single-threading performance much, but would make the throughput lower for multi-threading performance because of synchronization. Thus it makes more sense to just remove the cache completely, now that the algorithm is faster without cache than it was before with the cache.
Benchmarks
Simple: 4x4 colors, 32x32 pixels output. Complex: 9x9 colors, 256x256 pixels output.
Pixel 7 (Android 14)
517 662 ns 27 allocs Trace BlurHashDecoderBenchmark.originalSimpleNoCache
225 079 ns 27 allocs Trace BlurHashDecoderBenchmark.originalSimpleWithCache
108 153 ns 8 allocs Trace BlurHashDecoderBenchmark.newSimple
162 140 992 ns 92 allocs Trace BlurHashDecoderBenchmark.originalComplexNoCache
46 720 337 ns 92 allocs Trace BlurHashDecoderBenchmark.originalComplexWithCache
13 124 883 ns 8 allocs Trace BlurHashDecoderBenchmark.newComplex
Nexus 5 (Android 6)
6 389 579 ns 26 allocs Trace BlurHashDecoderBenchmark.originalSimpleNoCache
2 358 534 ns 26 allocs Trace BlurHashDecoderBenchmark.originalSimpleWithCache
1 546 684 ns 7 allocs Trace BlurHashDecoderBenchmark.newSimple
2 258 036 875 ns 91 allocs Trace BlurHashDecoderBenchmark.originalComplexNoCache
404 196 250 ns 91 allocs Trace BlurHashDecoderBenchmark.originalComplexWithCache
185 823 959 ns 7 allocs Trace BlurHashDecoderBenchmark.newComplex
The source code of the benchmark can be found here.
Other code changes
- Replace custom function
timed()
with standardmeasureTime()
in demo app - Replace
junit.framework.Assert.assertTrue
withorg.junit.Assert.assertArrayEquals
.
Update configuration to modern standards
(So the project can build on modern versions of Android Studio)
- Upgrade dependencies: Kotlin 2.0.0, AppCompat 1.7.0, Androidx Test Runner 1.6.1, JUnit 4.13.2, Android Gradle Plugin 8.5.0, Gradle 8.7.0
- Bump
minSdkVersion
to 21 andtargetSdkVersion
to 34 - Replace abandoned jcenter repository with mavenCentral
- Add
namespace
to replacepackage
inAndroidManifest.xml
- Target Java 11 bytecode
- Remove obsolete Kotlin Android Extensions plugin
- Disable Jetifier.