JayDDee

Results 60 comments of JayDDee

Allium & Lyra2Z AVX512 & AVX2 are up to date with 2 stage blake256 prehash optimization using linear SIMD for the first stage and Nway parallel for the second. X17...

Many chained algorithms have redundant endian byte swaps that can be eliminated. Blake is often the first hash function in a chain and it either performs a bswap32 (blake256) or...

The blake family of core hash functions can be optimized with linear vectoring (one way). Blake256 & blake2s can use SSE2 while blake512 & blake2b can use SSE2 or AVX2....

Another midstate optimization. Centralize midstate prehash by doing it in stratum thread or when a miner thread returns from getwork and sharing the result with all miner threads. Previously each...
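
A minimal sketch of the idea, not the cpuminer-opt code: the nonce-independent part of the header is hashed once, and every miner thread copies that shared midstate instead of recomputing it. It uses blake2s from Python's hashlib as a stand-in; the 80-byte header, the 64-byte prefix, the nonce position and all function names are assumptions made for illustration.

```python
import hashlib
import struct
from concurrent.futures import ThreadPoolExecutor

PREFIX_LEN = 64   # assumed: bytes of the header that do not depend on the nonce
NONCE_OFFSET = 76  # assumed: 4-byte little-endian nonce at the end of an 80-byte header


def compute_midstate(header: bytes):
    """Hash the nonce-independent prefix once (e.g. in the stratum thread)."""
    state = hashlib.blake2s()
    state.update(header[:PREFIX_LEN])
    return state


def hash_with_nonce(midstate, tail: bytes, nonce: int) -> bytes:
    """Finish the hash from a private copy of the shared midstate."""
    h = midstate.copy()  # cheap copy, the prefix is not rehashed
    h.update(tail + struct.pack("<I", nonce))
    return h.digest()


def scan(midstate, tail: bytes, start: int, count: int):
    """Work done by one (hypothetical) miner thread over its nonce range."""
    return [hash_with_nonce(midstate, tail, n) for n in range(start, start + count)]


def mine(header: bytes, threads: int = 4, per_thread: int = 8):
    midstate = compute_midstate(header)        # computed once, centrally
    tail = header[PREFIX_LEN:NONCE_OFFSET]     # remaining bytes before the nonce
    with ThreadPoolExecutor(max_workers=threads) as pool:
        jobs = [pool.submit(scan, midstate, tail, t * per_thread, per_thread)
                for t in range(threads)]
        return [digest for job in jobs for digest in job.result()]
```

Each thread only pays for the short tail plus the nonce, and the result is identical to hashing the full header from scratch.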

Some old algos have been found not to have proper stats reporting when using an old CPU (#392). Some will be fixed in v3.21.3 but there may be more remaining....

I have a theory for this apparent paradox. The Lyra2 optimization was intended to reduce memory access and targeted algos that use a larger lyra2 matrix. x25x uses a smaller...

Things are getting weird. I'm trying to implement both functions, so that most algos can use the implementation from the current release and x25 can use the one from...

I can clearly define the problem but have no solution. v1 is the code from v3.11.1 where x25x and x22i are faster, allium, lyra2z etc are slower. v2 is the...

Major developments, but first some background. The original issue is due to data divergence when hashing Lyra2 2-way parallel. Lyra2 parallel AVX512 is hashed in two 256-bit lanes....

GCC is starting to piss me off. It won't let me implement both versions. When I saw slow results on x25x using v1 I put a printf in the v2 path...