Yusuf Redžić comments

Results: 80 comments of Yusuf Redžić

Add aarch64 neon simd support

I've written the initial implementation with 4x unrolling and handling the main loop so that the accesses are aligned. The structure is basically the same as the SSE2 implementation. These...
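For readers unfamiliar with the loop shape described above, here is a minimal, portable sketch of a 4x-unrolled search whose main loop starts at an aligned address. It is not the code from the PR: it uses plain scalar Rust as a stand-in for NEON vector loads, and the names `find_unrolled`, `LANE`, and `UNROLL` are hypothetical, chosen only for illustration.

```rust
/// Bytes per simulated vector register (a NEON register holds 16 bytes).
const LANE: usize = 16;
/// Number of vector-sized blocks handled per main-loop iteration.
const UNROLL: usize = 4;

/// Hypothetical scalar stand-in for an unrolled, aligned SIMD search.
pub fn find_unrolled(needle: u8, haystack: &[u8]) -> Option<usize> {
    // Byte-by-byte scan of haystack[start..end], returning an absolute index.
    let scan = |start: usize, end: usize| -> Option<usize> {
        haystack[start..end]
            .iter()
            .position(|&b| b == needle)
            .map(|i| start + i)
    };

    // Unaligned prefix: everything before the first LANE-aligned address.
    let head = haystack.as_ptr().align_offset(LANE).min(haystack.len());
    if let Some(i) = scan(0, head) {
        return Some(i);
    }

    // Main loop: every iteration starts at an aligned address and tests
    // UNROLL blocks before branching, as an unrolled vector loop would.
    let mut pos = head;
    while haystack.len() - pos >= LANE * UNROLL {
        let group = &haystack[pos..pos + LANE * UNROLL];
        let mut hit = false;
        for block in group.chunks_exact(LANE) {
            hit |= block.iter().any(|&b| b == needle);
        }
        if hit {
            // Only now locate the exact position inside the group.
            return scan(pos, pos + LANE * UNROLL);
        }
        pos += LANE * UNROLL;
    }

    // Tail: fewer than LANE * UNROLL bytes remain.
    scan(pos, haystack.len())
}
```

A vector implementation would replace the inner `any` with a compare instruction per block and combine the four results before testing them, which is where the unrolling pays off on large haystacks.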

If anyone wants to see the code used in the benchmark, it is available here: https://github.com/redzic/memchr/tree/aarch64-neon I think from here, besides just cleaning up the code, memchr2, memchr3, and memrchr,...

Add initial aarch64 neon support

~~@BurntSushi do you know how to run the entire benchmark suite for the functions in this crate only (i.e., not libc or anything else) so that they are formatted like...

Add initial aarch64 neon support

Here are the initial benchmark numbers (tested on an 8-core M1 Pro): So it seems like we only lose for small/tiny haystacks with more common occurrences. I think reducing unrolling...

Add initial aarch64 neon support

I think this is what remains for this PR: 1. Ensure this works on big-endian (since there are apparently some big-endian ARM processors). I'll try to see if...

Use of `AtomicPtr` in `unsafe_ifunc` prevents memchr from being inlined when compiled with avx enabled

I will try to do some benchmarks later, but one use case of inlining memchr is that it could allow LLVM to specialize the function for haystacks with statically-known lengths,...
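To illustrate why dispatch through an `AtomicPtr` blocks inlining, here is a minimal sketch of that general pattern. It is an assumption-laden illustration, not the crate's actual macro: `FindFn`, `IMPL`, `detect`, `find_fallback`, and `find` are hypothetical names, and the "detection" step simply picks a scalar fallback.

```rust
use std::mem;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Signature shared by every candidate implementation (hypothetical).
type FindFn = fn(u8, &[u8]) -> Option<usize>;

/// Hypothetical fallback; a real crate would choose a SIMD routine here.
fn find_fallback(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// The first call lands here: choose an implementation, cache it, forward.
fn detect(needle: u8, haystack: &[u8]) -> Option<usize> {
    let chosen: FindFn = find_fallback;
    IMPL.store(chosen as *mut (), Ordering::Relaxed);
    chosen(needle, haystack)
}

/// Type-erased pointer to the currently selected implementation.
static IMPL: AtomicPtr<()> = AtomicPtr::new(detect as *mut ());

/// Public entry point. Every call loads the pointer and jumps through it,
/// so the compiler cannot see which function runs and cannot inline it.
pub fn find(needle: u8, haystack: &[u8]) -> Option<usize> {
    let raw = IMPL.load(Ordering::Relaxed);
    // SAFETY: IMPL only ever holds pointers created from `FindFn` values.
    let f: FindFn = unsafe { mem::transmute::<*mut (), FindFn>(raw) };
    f(needle, haystack)
}
```

Because the callee is only known at run time, the optimizer also cannot specialize the search for a haystack whose length is known at compile time, which is the use case raised in the comment above.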

[Meta] Parity with aomenc

Sorry for unpinning 😅 I accidentally clicked the button, I pinned it back

Add initial aarch64 neon support

@CeleritasCelery Any help would be good, but I guess the "bureaucratic" stuff like the build.rs version check is what's preventing this from being merged now. Or maybe @BurntSushi has been...

Crashes for some chunks

This appears to be a memory safety bug in aomenc: your logs show that it SIGSEGV-ed on those chunks. Unfortunately there is nothing av1an can do to...

rav1e_config_set_sample_aspect_ratio() from the C api silently accepts unsupported values

From rav1e's point of view, it seems easier to deprecate rav1e_context_new() and add another function similar to it, but that returns the possible error in the configuration, maybe like a...