Sergey "Shnatsel" Davidoff
I've opened https://github.com/image-rs/jpeg-decoder/pull/168 for parallelizing IDCT. We can combine it with SIMD later to hopefully outperform libjpeg-turbo in the future. Sadly it doesn't do all that much for performance because...
They're inlined! `perf` is just that good. I'm using this in Cargo.toml:
```
[profile.release]
debug=true
```
and profiling with `perf record --call-graph=dwarf` so that it uses debug info to see...
I'm afraid that JPEG decoding will always be significantly slower in WASM than it is in native code. It's very computationally expensive and relies on SIMD and/or parallelization to perform...
As of version 0.2.6, on a 6200x8200 CMYK image, `jpeg-decoder` is actually faster than `libjpeg-turbo` on my 4-core machine! Without the `rayon` feature it's 700ms for `jpeg-decoder` vs 800ms for...
Oops. I fear the celebration has been premature. Now that I've tested it on a selection of photos, it appears that `jpeg-decoder` is still considerably slower than libjpeg-turbo even with...
The function in question is now also doing shuffling (i.e. it gathers results from 4 vectors, each with its own component, and interleaves them in the output). https://github.com/image-rs/jpeg-decoder/blob/09a2c2d81c89014f1477a2507b20f5fa9e3d8e01/src/decoder.rs#L1280-L1296 The shuffling...
It might be worth considering splitting the shuffling and the actual color conversion, so that the color conversion could still be vectorized.
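A rough sketch of what such a split could look like, not the actual `jpeg-decoder` code. The function names are hypothetical, and the per-sample inversion is only a placeholder for whatever arithmetic the real color conversion does. The point is that stage 1 is a plain elementwise loop over one plane, which the compiler has a good chance of auto-vectorizing, while all the cross-plane data movement is isolated in stage 2.

```rust
/// Stage 1: convert a single plane with no cross-plane data movement.
/// The inversion here is a placeholder for a real color transform (assumption).
fn convert_plane(input: &[u8], output: &mut [u8]) {
    for (out, &v) in output.iter_mut().zip(input) {
        *out = 255 - v;
    }
}

/// Stage 2: pure shuffling. Interleaves four planes into packed 4-byte pixels.
fn interleave4(planes: [&[u8]; 4], output: &mut [u8]) {
    for (i, px) in output.chunks_exact_mut(4).enumerate() {
        px[0] = planes[0][i];
        px[1] = planes[1][i];
        px[2] = planes[2][i];
        px[3] = planes[3][i];
    }
}

/// Runs both stages: convert each plane separately, then interleave.
fn convert_then_interleave(planes: [&[u8]; 4]) -> Vec<u8> {
    let len = planes[0].len();
    assert!(planes.iter().all(|p| p.len() == len));

    // Temporary planar buffers holding the converted samples.
    let mut converted = vec![vec![0u8; len]; 4];
    for (dst, src) in converted.iter_mut().zip(planes.iter()) {
        convert_plane(src, dst);
    }

    let mut out = vec![0u8; len * 4];
    interleave4(
        [
            &converted[0][..],
            &converted[1][..],
            &converted[2][..],
            &converted[3][..],
        ],
        &mut out,
    );
    out
}
```

The cost is an extra pass over the data and a set of temporary buffers, so whether this wins overall would have to be measured.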
I've scanned ~10,000 JPEGs and failed to find any file that would trigger this codepath, so I'm just going to assume this is a rarely taken codepath that's not really...
Also I think this is the only JPEG mismatch across multiple corpora of ~65,000 JPEGs total, which is *incredibly* impressive.
I'll see if I can run a before/after test.