Nathan Moinvaziri
> Having a function table entry for inflate_table probably wouldn't be the worst thing in the world but you can get away without it. It would increase the code size...
Here is a Quick Bench run: https://quick-bench.com/q/DW80QBZoRy1UBwUEH-JMTeqYDGA It shows that loop unrolling performs poorly, at least on x86. The AVX2 version is 1.5 times faster than the scalar version.
I have implemented the vectorized versions of code length counting.
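For reference, the scalar step that a vectorized version has to reproduce can be sketched roughly as below. This is a minimal sketch, not the code from this PR: `count_code_lengths` and `MAX_BITS` are hypothetical names chosen for illustration, and it assumes code lengths are stored as `unsigned short` values in the range 0..15.

```c
#include <stddef.h>

/* Hypothetical name for illustration; assumes code lengths never exceed 15 bits. */
#define MAX_BITS 15

/* Scalar baseline: count how many codes have each bit length.
 * lens[i] is the bit length of symbol i (0 means the symbol is unused).
 * count[] must have room for MAX_BITS + 1 entries. */
static void count_code_lengths(const unsigned short *lens, size_t codes,
                               unsigned short count[MAX_BITS + 1]) {
    size_t i;

    /* Clear the histogram first. */
    for (i = 0; i <= MAX_BITS; i++)
        count[i] = 0;

    /* One increment per symbol, indexed by that symbol's code length. */
    for (i = 0; i < codes; i++)
        count[lens[i]]++;
}
```

Any SIMD variant should produce the same `count[]` contents for the same input, which makes this a convenient check when benchmarking.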
AFAIK 286. I couldn't get the godbolt code working; it kept crashing, so…
Ok, I threw it back in. Let's see what happens; maybe it was something else.
@dougallj that seems to have done the trick! Thanks.
@dougallj here is the error on RISC-V: https://github.com/zlib-ng/zlib-ng/actions/runs/19886000159/job/56993275451?pr=2037
```
/home/runner/work/zlib-ng/zlib-ng/zbuild.h:132:24: warning: ‘asm’ operand 1 probably does not match constraints
  132 | # define Z_TOUCH(var) __asm__ ("" : "+r"(var))
      |                       ^~~~~~~...
```
In Compiler Explorer, `+rm` also suffers. So maybe, like you said, I just need to do it only for ARM/x86.
I just excluded `Z_TOUCH` for mips and riscv since they were the ones having the problem. I figured it was simpler than listing all the architectures where it did work...
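Roughly, the exclusion could look like the sketch below. The `Z_TOUCH` definition is the one from the warning above; the exact preprocessor guard and the no-op fallback are my assumptions for illustration and may not match what is in `zbuild.h`. `sum_touched` is a hypothetical helper that only exists to show the macro in use.

```c
#include <stddef.h>

/* Assumed guard: GNU-style inline asm is available and the target is
 * neither MIPS nor RISC-V, the two that produced the constraint warning. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    !defined(__mips__) && !defined(__riscv)
/* Empty asm that claims to read and write var in a register, so the
 * compiler must treat var as possibly changed at this point. */
#  define Z_TOUCH(var) __asm__ ("" : "+r"(var))
#else
/* Assumed fallback: evaluate var and do nothing else. */
#  define Z_TOUCH(var) (void)(var)
#endif

/* Hypothetical helper: sums the array while touching the accumulator on
 * every iteration. The result is the same with or without the asm. */
static unsigned sum_touched(const unsigned short *vals, size_t n) {
    unsigned total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += vals[i];
        Z_TOUCH(total);
    }
    return total;
}
```

Listing the excluded targets keeps the guard short, at the cost of having to extend it if another architecture turns out to reject the `"+r"` constraint.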