Casey Muratori

Results 24 issues of Casey Muratori

Modern Meow Hash (v0.4 and up) is actually faster than Mum on small inputs now in randomized testing. The version checked into this tree is a very old version and...

Probably copy pasta from u64, but it should not have the "unsigned". - Casey

Because we had to change the vpxor to a vxorps to ensure AVX-1 compatibility, we probably want to go ahead and use PS everywhere now, because we don't want any...

Right now there is a really hacky thing called StartGateCounter that is used to attempt to give threads a chance to receive their instructions and start somewhat in lockstep where...
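
For illustration only, a start gate of this general kind can be sketched with C11 atomics: each thread announces its arrival, then spins until every expected thread has arrived. This is not the StartGateCounter code from the tree. The names `start_gate`, `start_gate_init`, and `start_gate_wait` are hypothetical, and the sketch assumes a single-use gate with a thread count known up front.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical single-use start gate; not the real StartGateCounter. */
typedef struct {
    atomic_uint arrived;  /* number of threads that have reached the gate */
    unsigned expected;    /* number of threads that must arrive before release */
} start_gate;

static void start_gate_init(start_gate *gate, unsigned thread_count)
{
    assert(gate && thread_count > 0);
    atomic_init(&gate->arrived, 0);
    gate->expected = thread_count;
}

/* Called by each worker thread once it has its instructions. */
static void start_gate_wait(start_gate *gate)
{
    /* Announce arrival, then spin until all expected threads have arrived. */
    atomic_fetch_add_explicit(&gate->arrived, 1, memory_order_acq_rel);
    while (atomic_load_explicit(&gate->arrived, memory_order_acquire) < gate->expected) {
        /* busy-wait: keeps the thread hot so release is close to simultaneous */
    }
}
```

Spinning rather than sleeping is a deliberate choice in this sketch: a sleeping thread wakes with scheduler latency, which would skew the start times a gate like this is meant to align.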

At the moment, I haven't looked at the read/write ASM so I want to remember to look to see if the xors use a memory op to ensure 2-load 2-store...

At the moment, blandwidth doesn't probe for NUMA patterns on the CPU (or across CPUs). It would not be a particularly difficult thing to do, it would just be a...

The code in blandwidth is designed to make it easier for someone to make a linux_blandwidth.c file that would allow blandwidth to be built on Linux. However, I am far...

Although in theory the 128-bit routines in Blandwidth are written to ensure they can use 2 read and 2 write ports on every cycle where available, and CLANG produces ASM...

To allow for hashing files larger than available memory, and to prevent possible incompatibilities with ftell on 32-bit builds, I'd like to stop reading files into memory wholesale and start...
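
A minimal sketch of the chunked approach, assuming plain `fread` into a fixed buffer: the file size is never queried, so `ftell` is not involved and memory use stays constant. FNV-1a is used here purely as a stand-in for a streaming hash state; it is not Meow Hash. `CHUNK_SIZE`, `fnv1a_update`, and `hash_stream` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK_SIZE (64 * 1024) /* hypothetical buffer size */

/* FNV-1a update step: a stand-in for a real streaming hash state. */
static uint64_t fnv1a_update(uint64_t state, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        state ^= data[i];
        state *= 0x100000001b3ULL; /* 64-bit FNV prime */
    }
    return state;
}

/* Hash a whole stream chunk by chunk.
   Returns 0 on success, -1 on a read error.
   Uses a static buffer, so this sketch is not thread-safe. */
static int hash_stream(FILE *file, uint64_t *out_hash)
{
    static unsigned char buffer[CHUNK_SIZE];
    uint64_t state = 0xcbf29ce484222325ULL; /* 64-bit FNV offset basis */
    size_t got;

    assert(file && out_hash);
    while ((got = fread(buffer, 1, sizeof(buffer), file)) > 0) {
        state = fnv1a_update(state, buffer, got);
    }
    if (ferror(file)) {
        return -1;
    }
    *out_hash = state;
    return 0;
}
```

Because the loop stops on a short read and then checks `ferror`, it works the same for regular files and for pipes, and the result matches hashing the same bytes in a single call.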

1) We do not know why CLANG generates subtract-from-pointer loads instead of add-to-pointer loads. It _says_ it's faster when _it_ generates the code inside meow_bench, but when we extract the...