Radzivon Bartoshyk
I've seen `_mm_mulhi_epu16` run 300–500% slower than `_mm_mulhi_epi16` on AMD CPUs a few times, probably because it's microcoded on some chips. And these are modern CPUs, so I'm trying to...
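For context on how the two intrinsics relate, here is a minimal scalar sketch in Rust of what each one computes per 16-bit lane, plus one well-known way to get the unsigned result from the signed high multiply. The function names are hypothetical and this is only an illustration of the arithmetic, not the approach the comment above goes on to describe.

```rust
/// Per-lane model of `_mm_mulhi_epi16`: high 16 bits of the signed 32-bit product.
fn mulhi_i16(a: i16, b: i16) -> i16 {
    ((a as i32 * b as i32) >> 16) as i16
}

/// Per-lane model of `_mm_mulhi_epu16`: high 16 bits of the unsigned 32-bit product.
fn mulhi_u16(a: u16, b: u16) -> u16 {
    ((a as u32 * b as u32) >> 16) as u16
}

/// Unsigned high multiply built from the signed one (hypothetical helper).
/// Reinterpreting a u16 with the top bit set as i16 subtracts 2^16 from it,
/// so the signed high half is short by the other operand; add it back mod 2^16.
fn mulhi_u16_via_signed(a: u16, b: u16) -> u16 {
    let (sa, sb) = (a as i16, b as i16);
    let hi = mulhi_i16(sa, sb) as u16;
    let fix_a = if sa < 0 { b } else { 0 };
    let fix_b = if sb < 0 { a } else { 0 };
    hi.wrapping_add(fix_a).wrapping_add(fix_b)
}
```

In a vectorized version the two corrections would cost extra compare, mask, and add instructions, so whether this beats the native unsigned instruction on a given chip is something to measure rather than assume.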
If you want 0.5 ULP math, there is also Rust source [here](https://github.com/awxkee/pxfm). Not everything comes from CORE-Math, but it should be correct for the round-to-nearest mode and is pretty straightforward to...
> It looks like Glide has [an integration for Android](https://github.com/bumptech/glide/blob/master/integration/avif/src/main/java/com/bumptech/glide/integration/avif/AvifByteBufferBitmapDecoder.java) that already does this,

The Glide integration does not support HDR images or ICC profiles. TBH basic AVIF support is relatively...
Thanks. But I didn't use those flags so it won't help.
Thanks. But the issue with libjxl itself still stands. I had to rebuild the libraries last week due to Google regulations, and it still adds 1 MB for aarch64. https://github.com/awxkee/jxl-coder/commit/7453892b0ac199dfe047c4e98dd75c1d3a8e7157
Sure. Change NDK_PATH and the path to llvm-strip to yours. FWIW, I'm just building versions checked out from tags, nothing interesting. Build script:

```bash
#!/bin/bash
set -e
export NDK_PATH=
export NDK=$NDK_PATH...
```
Unfortunately, I can't confirm that it fully works. With everything disabled it's 2 MB. While that's much more reasonable, it's still ~200 KB bigger than it once was (1.8 MB). Since 70%...
You could also use a bloat profiler such as https://github.com/google/bloaty. I believe it makes more sense to look for methods with unnecessary generics or templates, which tend to bloat the binary...
LTO is not a silver bullet and can work both ways. With libjxl it actually works the other way, increasing the size from 2 MB to 2.1 MB. It’s more...
> Shouldn't LTO also give some performance benefits too?

Sure, by aggressively inlining methods that are normally not considered inlinable from other object files, which makes the binary larger.