julia icon indicating copy to clipboard operation
julia copied to clipboard

Implement faster thread local rng for scheduler

Open gbaraldi opened this issue 1 year ago • 5 comments

Implement optimal uniform random number generator using the method proposed in https://github.com/swiftlang/swift/pull/39143 based on OpenSSL's implementation of it in https://github.com/openssl/openssl/blob/1d2cbd9b5a126189d5e9bc78a3bdb9709427d02b/crypto/rand/rand_uniform.c#L13-L99

This PR also fixes some bugs found while developing it. This is a replacement for https://github.com/JuliaLang/julia/pull/50203 and fixes the issues found by @IanButterworth with both rngs

C rng image

New scheduler rng image

~On my benchmarks the julia implementation seems to be almost 50% faster than the current implementation.~ With oscars suggestion of removing the debiasing this is now almost 5x faster than the original implementation. And almost fully branchless

We might want to backport the two previous commits since they technically fix bugs.

gbaraldi avatar Aug 15 '24 17:08 gbaraldi

Hmm, this seems to have made i686 very unhappy. Ok the issue is that pointer load. 32 bit has different alignment which is what is blowing this up

gbaraldi avatar Aug 15 '24 20:08 gbaraldi

We might want to backport the two previous commits since they technically fix bugs.

Sounds like that should be a separate PR

giordano avatar Aug 16 '24 00:08 giordano

Potentially naive question, but do we have a good benchmark to demonstrate that a faster scheduler rng actually speeds up scheduling?

I ask because in my testing of https://github.com/JuliaLang/julia/pull/50203 I saw the surprising behavior that the threaded fib got slower the faster the rng used (with no changes to allocations)

function fib(n::Int)
    n < 2 && return n
    t = Threads.@spawn fib(n - 2)
    return fib(n - 1) + fetch(t)
end

IanButterworth avatar Aug 16 '24 14:08 IanButterworth

I don't think it will affect it too too much. Because if the rand call is super hot then things aren't going so well elsewhere :)

gbaraldi avatar Aug 16 '24 14:08 gbaraldi

There are already merge conflicts

giordano avatar Aug 27 '24 18:08 giordano

@nanosoldier runtests(ALL, vs = ":master")

vchuravy avatar Aug 29 '24 20:08 vchuravy

The package evaluation job you requested has completed - possible new issues were detected. The full report is available.

nanosoldier avatar Sep 01 '24 14:09 nanosoldier

Do we need to run benchmarks, too?

giordano avatar Sep 01 '24 15:09 giordano

I don't think any benchmark is gonna show this. At least no current one. Calling the function is 5x faster that's all I can say

gbaraldi avatar Sep 03 '24 14:09 gbaraldi