Sergey "Shnatsel" Davidoff
Note to future self: try converting the intermediate values to wider types to make the whole vector wider and coax the compiler into vectorizing it.
i16 is wide enough for `unfilter_paeth6`, so it does get vectorized, but then I run into another problem: it doesn't get inlined, which also tanks performance. I can force inlining,...
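For reference, a minimal sketch of what "do the math in i16" means for the Paeth filter. This is not the code from the `png` crate: `paeth_predictor` and `unfilter_paeth` are hypothetical names, the const-generic `BPP` parameter is an assumption made for illustration, and `#[inline(always)]` is just one way to force inlining. Whether the loop actually vectorizes depends on the compiler version and target, so check the generated assembly rather than trusting the shape of the code.

```rust
/// Paeth predictor from the PNG spec, with intermediates held in i16.
/// a + b - c ranges over -255..=510, so it does not fit in u8 but fits in i16.
#[inline(always)]
fn paeth_predictor(a: u8, b: u8, c: u8) -> u8 {
    let (a16, b16, c16) = (i16::from(a), i16::from(b), i16::from(c));
    let p = a16 + b16 - c16;
    let pa = (p - a16).abs();
    let pb = (p - b16).abs();
    let pc = (p - c16).abs();
    // Tie-breaking order is fixed by the spec: left, then above, then upper-left.
    if pa <= pb && pa <= pc {
        a
    } else if pb <= pc {
        b
    } else {
        c
    }
}

/// Undo the Paeth filter for one row, BPP bytes per pixel (hypothetical helper).
/// `prev` is the already-unfiltered previous row, `cur` is modified in place.
fn unfilter_paeth<const BPP: usize>(prev: &[u8], cur: &mut [u8]) {
    let mut left = [0u8; BPP]; // pixel to the left in the current row
    let mut upper_left = [0u8; BPP]; // pixel to the left in the previous row
    for (cur_px, prev_px) in cur.chunks_exact_mut(BPP).zip(prev.chunks_exact(BPP)) {
        for i in 0..BPP {
            let pred = paeth_predictor(left[i], prev_px[i], upper_left[i]);
            let x = cur_px[i].wrapping_add(pred);
            cur_px[i] = x;
            left[i] = x;
            upper_left[i] = prev_px[i];
        }
    }
}
```

Calling `unfilter_paeth::<3>(&prev, &mut cur)` covers the 3bpp case and `unfilter_paeth::<6>` the 6bpp one. Swapping `i16` for `i32` in `paeth_predictor` is a one-line change if you want to compare the codegen.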
Okay, I've finally figured out the 3 and 6 bpp case, PR up: #513. Edit: never mind, turns out the original was already vectorized, my changes actually scalarized it and that...
I measured the performance with and without the `unstable` feature to get a sense for the improvement we might expect from this change. 8-bit RGBA doesn't seem to benefit at all,...
I couldn't get the straightforward adaptation of the 3bpp algorithm to autovectorize: https://godbolt.org/z/en417qP3b
If I switch from i16 to i32 as the type on which all operations are performed, it vectorizes fine on modern CPUs: https://godbolt.org/z/T67Gn74ze But in this form it completely fails...
After #539 there is very little benefit to using the `unstable` feature in the common cases. On ARM it does more harm than good and is disabled entirely. On x86...
Well, that's not right - it conflates two distinct vulnerabilities. But that's an issue with the upstream OSV data, not our code: https://osv.dev/vulnerability/GHSA-9328-gcfq-p269 I don't get where they even got...
Also, `FrameInfo` seems to be a better name for it than `OutputInfo`, and we might want to change it while we're at it.
WebP, PNG and JPEG encoders allow setting an ICC profile as of `image` v0.25.6