Caleb Zulawski

80 comments by Caleb Zulawski

I diffed the asm for `unfilter_paeth3`, `unfilter_paeth_u8`, and `unfilter_paeth6` for `x86_64-unknown-linux-gnu`, `x86_64-pc-windows-gnu`, and `x86_64-pc-windows-msvc`, and there are zero differences other than a couple of extra calling-convention-related moves on Windows at...

Not to be flippant, but that example only uses a single SIMD arithmetic operation:

```rust
*x += paeth_predictor_u8(state.a, b, state.c);
```

The actual work is still scalar:

```rust
let mut...
```
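
For context on what "the actual work" refers to, here is a minimal scalar sketch of the Paeth predictor as the PNG specification defines it, plus a row unfilter loop built on it. The names `paeth_predictor` and `unfilter_paeth_row` and the `bpp` parameter are hypothetical, chosen for illustration; this is not the code from the thread.

```rust
/// Scalar Paeth predictor (PNG filter type 4).
/// `a` = left, `b` = above, `c` = upper-left. Hypothetical helper name.
pub fn paeth_predictor(a: u8, b: u8, c: u8) -> u8 {
    // Widen to i16 so the initial estimate cannot overflow.
    let p = a as i16 + b as i16 - c as i16;
    let pa = (p - a as i16).abs();
    let pb = (p - b as i16).abs();
    let pc = (p - c as i16).abs();
    // Pick the neighbor closest to the estimate; ties prefer a, then b.
    if pa <= pb && pa <= pc {
        a
    } else if pb <= pc {
        b
    } else {
        c
    }
}

/// Undo the Paeth filter for one row in place.
/// `bpp` is bytes per pixel; `prev` is the already-unfiltered previous row.
pub fn unfilter_paeth_row(bpp: usize, prev: &[u8], cur: &mut [u8]) {
    for i in 0..cur.len() {
        // Pixels to the left of the row start are treated as zero.
        let a = if i >= bpp { cur[i - bpp] } else { 0 };
        let b = prev[i];
        let c = if i >= bpp { prev[i - bpp] } else { 0 };
        cur[i] = cur[i].wrapping_add(paeth_predictor(a, b, c));
    }
}
```

Each output byte depends on the byte `bpp` positions earlier in the same row, so the loop carries a serial dependency; only the final add is a natural fit for a vector operation.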

Also, when examining the asm, the SIMD version is still only one instruction longer, though some of the instructions move around. Since it's so similar, I'm surprised it's noticeably slower....

I was able to test as far back as 0.6.0 and it had the same behavior.

Upon further inspection, it looks like ncurses is the only package in my entire package cache that has a placeholder dynamic dependency:

```
/Users/runner/miniforge3/conda-bld/ncurses_1738195744584/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/libtinfo.6.dylib (compatibility version 6.0.0, current version 6.0.0,...
```

Looks like I was a little wrong: the placeholder is supposed to be replaced with the full path. I was able to fix my build by calling `pathlib.Path.absolute` on the...

It actually looks to me like the path is hard-coded, so I don't think that would work either, unfortunately.

I could see something like this being useful. The only dispatcher overhead vs this, however, should be an atomic load. Is there really that much overhead?
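
To make the "one atomic load" claim concrete, here is a minimal sketch of a cached runtime dispatcher. It assumes a hypothetical `sum` function with two interchangeable implementations (`sum_scalar` and `sum_chunked`, the latter standing in for a feature-specialized version); none of these names come from the library under discussion, and a real dispatcher may cache a function pointer rather than a table index.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

type SumFn = fn(&[u32]) -> u32;

// Baseline implementation (hypothetical).
fn sum_scalar(xs: &[u32]) -> u32 {
    xs.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

// Stand-in for a feature-specialized implementation (hypothetical).
// It accumulates in 8 independent lanes, then reduces.
fn sum_chunked(xs: &[u32]) -> u32 {
    let mut lanes = [0u32; 8];
    let chunks = xs.chunks_exact(8);
    let rest = chunks.remainder();
    for chunk in chunks {
        for (lane, &x) in lanes.iter_mut().zip(chunk) {
            *lane = lane.wrapping_add(x);
        }
    }
    lanes.iter().chain(rest).fold(0u32, |acc, &x| acc.wrapping_add(x))
}

static TABLE: [SumFn; 2] = [sum_scalar, sum_chunked];

const UNINIT: usize = usize::MAX;
// Cached index into TABLE; UNINIT until the first call.
static CHOICE: AtomicUsize = AtomicUsize::new(UNINIT);

// Slow path: runs feature detection. Racing threads compute the same answer.
fn select_index() -> usize {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return 1;
        }
    }
    0
}

pub fn sum(xs: &[u32]) -> u32 {
    // Fast path: a single relaxed atomic load, then an indirect call.
    let mut idx = CHOICE.load(Ordering::Relaxed);
    if idx == UNINIT {
        idx = select_index();
        CHOICE.store(idx, Ordering::Relaxed);
    }
    TABLE[idx](xs)
}
```

After the first call, every invocation pays for the load, a predictable branch, and an indirect call. The indirect call also prevents inlining into the caller, which may matter more than the load itself for very small functions.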

LLVM appears to have pclmul depend on sse2: https://github.com/llvm/llvm-project/blob/3fc0d94ce57de2d0841e77c8fda7feef2923c4e0/llvm/lib/Target/X86/X86.td#L187
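
As a rough sanity check of that relationship on the host CPU, the sketch below queries both features at run time with the standard `is_x86_feature_detected!` macro. The function name `pclmul_implies_sse2` is hypothetical, and note the limitation: this reports what the running CPU supports, not LLVM's feature dependency table, so it can only fail to contradict the dependency, not prove it.

```rust
/// Returns true unless the host reports pclmulqdq without sse2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn pclmul_implies_sse2() -> bool {
    let pclmul = std::is_x86_feature_detected!("pclmulqdq");
    let sse2 = std::is_x86_feature_detected!("sse2");
    // Logical implication: pclmul => sse2.
    !pclmul || sse2
}

/// On non-x86 targets neither feature exists, so the implication holds vacuously.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn pclmul_implies_sse2() -> bool {
    true
}
```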

I'm not totally sure what the best way to handle this is. Specifying a CPU is a convenient shortcut but simply filtering out unsupported features could result in undefined behavior....