Aumetra Weisman
That commit should get rid of a bunch of compile issues related to adding generic types to structs that don't take them. I was testing this on an x86 system,...
I'm still trying to figure out what the best way forward is to make `to_bitmask64` on NEON CPUs work.
@liuq19 Would you be up to benching the speed of an implementation for NEON that runtime dispatches the bitmask creation? We could technically cache the result whether NEON (or any...
Hacked in the version that dispatches on each bitmask call. The performance hit may be too severe to justify, though.
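For comparison, here is a rough sketch of the cached variant mentioned above: feature detection runs once and later calls reuse a stored function pointer. This is not the code from the commit. `BitmaskFn`, `bitmask64_scalar`, `bitmask64_neon`, and `select_backend` are hypothetical names, the bit layout (bit `i` set when the high bit of byte `i` is set) is an assumption, and the NEON backend is only a stand-in that forwards to the scalar path.

```rust
use std::sync::OnceLock;

/// Signature shared by every bitmask backend (hypothetical, for illustration only).
type BitmaskFn = fn(&[u8; 64]) -> u64;

/// Portable fallback: bit `i` of the result is set when the high bit of byte `i` is set.
/// (Assumed semantics; the real `to_bitmask64` may define the mask differently.)
fn bitmask64_scalar(bytes: &[u8; 64]) -> u64 {
    let mut mask = 0u64;
    for (i, b) in bytes.iter().enumerate() {
        mask |= u64::from(*b >> 7) << i;
    }
    mask
}

/// Stand-in for a NEON backend. A real one would use `std::arch::aarch64`
/// intrinsics; here it just forwards to the scalar version.
#[cfg(target_arch = "aarch64")]
fn bitmask64_neon(bytes: &[u8; 64]) -> u64 {
    bitmask64_scalar(bytes)
}

/// Runs the feature check and picks a backend.
fn select_backend() -> BitmaskFn {
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return bitmask64_neon;
        }
    }
    bitmask64_scalar
}

/// Dispatching entry point: `select_backend` runs on the first call only,
/// every later call pays for one atomic load plus an indirect call.
pub fn to_bitmask64(bytes: &[u8; 64]) -> u64 {
    static BACKEND: OnceLock<BitmaskFn> = OnceLock::new();
    let backend = *BACKEND.get_or_init(select_backend);
    backend(bytes)
}
```

Whether the indirect call is actually cheaper than re-checking the feature on every call would still need to be benchmarked; the standard detection macros already cache their result internally, so the gain may be small.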
Okay, I'm not sure _why_ this is broken? It's on ARM64, right? I guess I'll have to whip out the cross-compilation for now. I don't own a suitable ARM machine...
Now I just need to find a way to properly express this in trait form, preferably very generic.
That's weird. On my local machine, the change is in the ballpark of 3-4%, which is _acceptable_ (I'd need to profile it to get a closer idea of what's...
Already wasn't active due to my global Cargo config. But for the benches above I set `-C target-cpu=native`. I can somewhat reproduce your findings when I toggle `-C target-cpu=native` for...
So it is _much_ slower without `target-cpu=native` but that is to be expected with all the CPU-specific optimizations LLVM can do. But the runtime detection only sets performance back by...
Never mind, I get what you mean. Let me look into it.