cats-effect icon indicating copy to clipboard operation
cats-effect copied to clipboard

Make `UUIDGen` more performant

Open armanbilge opened this issue 3 years ago • 19 comments

As of https://github.com/typelevel/cats-effect/pull/2880 it delegates to Sync#blocking.

Some ideas in https://github.com/typelevel/cats-effect/issues/2865#issuecomment-1062567227 to improve the status quo:

While UUID.randomUUID is blocking, we can do better:

  • Try to create a SecureRandom.getInstance("NativePRNGNonBlocking"). "NativePRNG" appears to be equivalent on nextBytes on Unix, but that's platform specific.
  • Until Java 9, there's still a mutex, but a pool fixes that.

armanbilge avatar Mar 18 '22 01:03 armanbilge

http4s/http4s#6138 is another hidden instance of this problem. Maybe what we really need is a way of safely and efficiently generating random bytes. UUIDGen, and this http4s use case, could both build on that.

rossabaker avatar Mar 20 '22 04:03 rossabaker

Yes, I've wondered if we need a SecureRandom[F] type class that extends Random[F]. But btw, at what point does this start scope-creeping outside of CE3? If we get a SecureRandom it will need a JS implementation, which gets into bobcats territory with different implementations for Node.js and browsers and a CI matrix spanning all of them.

armanbilge avatar Mar 20 '22 05:03 armanbilge

Ross has demoed his proposal in https://github.com/http4s/http4s/pull/6165.

armanbilge avatar Mar 20 '22 23:03 armanbilge

I rather like Ross' proposal.

djspiewak avatar Mar 21 '22 00:03 djspiewak

I reformed it a bit to add the pool. This is a win when we get caught in a mutex, which we always will in Java 8. In Java 9, if we have a non-blocking but threadsafe instance, we probably don't want that pool. But detecting thread safety is super gross.

rossabaker avatar Mar 21 '22 03:03 rossabaker

Would SecureRandom have any additional methods, or just a marker trait that it's, like, secure and stuff?

I can imagine a marker trait being useful, so a random client can express that it needs something stronger.

rossabaker avatar Mar 24 '22 16:03 rossabaker

I was thinking marker trait. I see the JDK SecureRandom adds a couple additional methods like getAlgorithm, no idea how useful those are.

https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/security/SecureRandom.html

armanbilge avatar Mar 24 '22 16:03 armanbilge

I have found getAlgorithm nice for debugging. The next would be great, because I already needed bits instead of bytes, and had to go from bytes back to bits. But Java's not going to give us that.

rossabaker avatar Mar 24 '22 16:03 rossabaker

Btw, another good reason to implement UUIDGen in terms of our own SecureRandom is that at least on Scala.js, the UUID.randomUUID() method is not cryptographically secure. Actually that seems like a pretty serious issue...

https://github.com/scala-js/scala-js/blob/058532aa8c504b76431b40e3e1b51b2cdef87643/javalib/src/main/scala/java/util/UUID.scala#L139

armanbilge avatar Mar 25 '22 22:03 armanbilge

Btw, another good reason to implement UUIDGen in terms of our own SecureRandom is that at least on Scala.js, the UUID.randomUUID() method is not cryptographically secure. Actually that seems like a pretty serious issue...

Uh yes, that's a relatively serious issue.

djspiewak avatar Mar 25 '22 23:03 djspiewak

Hey!

I want to take this ticket, I'm just wondering what would be the final solution here. Should we have something like this?

SecureRandom[F].nextBytes(16)

Using our own implementation of SecureRandom and them reimplement the steps done on UUID.randomUUID() ?

public static UUID randomUUID() {
    SecureRandom ng = Holder.numberGenerator;

    byte[] randomBytes = new byte[16];
    ng.nextBytes(randomBytes);
    randomBytes[6]  &= 0x0f;  /* clear version        */
    randomBytes[6]  |= 0x40;  /* set to version 4     */
    randomBytes[8]  &= 0x3f;  /* clear variant        */
    randomBytes[8]  |= 0x80;  /* set to IETF variant  */
    return new UUID(randomBytes);
}

diogocanut avatar May 24 '23 03:05 diogocanut

Hi @diogocanut, great! Good question, I think there are two changes we need to make:

  1. Exactly like you say, implement our own UUIDGen[F] using SecureRandom[F].

  2. Optimize SecureRandom[F] itself. Currently it is blocking which is bad for performance, but it doesn't have to be. In this PR Ross demonstrates how we can attempt to acquire a non-blocking secure random implementation. We should port that strategy to Cats Effect. https://github.com/http4s/http4s/pull/6165

armanbilge avatar May 24 '23 03:05 armanbilge

Hm, I'm wondering if the NativePRNGNonBlocking part is still relevant? I'm trying to find some up to date info on it, and it seems to be solved even on Linux: https://lwn.net/Articles/884875/.

(Of course the other thing, JVM 8 using synchronized is certainly relevant on JVM 8.)

durban avatar Jun 10 '23 21:06 durban

Uhh, NativePRNGNonBlocking also seems to have a lock... See also https://bugs.openjdk.org/browse/JDK-8278371.

durban avatar Jun 11 '23 09:06 durban

Here's a potentially mad idea ... could we just implement our own secure random by reading from /dev/urandom? It's an ordinary file so we shouldn't need any JNI or native calls to use it. And even if we did still need some kind of locking, at least we could do it with Mutex so it's only fiber-blocking.

armanbilge avatar Jun 12 '23 04:06 armanbilge

I thought about that, but it seems like a can of worms (although if someone wants to open it, I definitely won't stop them):

  • What to do on exotic platforms, where there is no /dev/urandom (like... Windows)?
  • I believe /dev/urandom still happily returns non-random data shortly after boot (under certain circumstances). I'm not sure how "shortly" that happens (e.g., is it observable from a JVM at all), but it seems somewhat scary.
    • Btw, I have the same concern about NativePRNGNonBlocking.
  • As far as I know, the alternative /dev/random is fixed on modern Linux, but on older versions it blocks when it feels like it.

durban avatar Jun 13 '23 13:06 durban

  • What to do on exotic platforms

Fallback to the current scheme: use JDK SecureRandom with blocking(...)

  • I believe /dev/urandom still happily returns non-random data shortly

Yeah ... and I think you are right that NativePRNGNonBlocking would have similar issue.

  • As far as I know, the alternative /dev/random is fixed on modern Linux, but on older versions it blocks when it feels like it.

Yeah, this seems the most promising avenue, based on the article you linked.


In light of above, my best ideas are:

  1. check for modern linux and use /dev/random, otherwise fallback to blocking with JDK SecureRandom

  2. give up and just use blocking with JDK SecureRandom 🤷

armanbilge avatar Jun 13 '23 14:06 armanbilge

Oh, right, here it's possible to just fall back to blocking... that sounds good to me. A possible (but somewhat convoluted, maybe even fragile) third option:

  1. When creating the SecureRandom, a. read a little data from /dev/random (in blocking): this will make sure the kernel RNG is initialized b. write this data into /dev/urandom (in delay): this will make sure that urandom is also initialized, even if it's using a different RNG from /dev/random (I think older Linux might do this?) c. read from /dev/urandom for nextBytes and similar (in delay): it should be safe after b. d. (if any of a./b. fails, fall back to a SecureRandom with blocking)

I think this option has the advantage of also working on older Linux.

(Fallback could be smarter, e.g., detect if we have a sufficiently modern OS that the dance with reading-writing is not necessary, ...)

durban avatar Jun 14 '23 23:06 durban

Also, there is a JDK built-in SecureRandom called Windows-PRNG, which uses a Windows API call to get random numbers. Which sounds good, but I could not find any info on whether that call is blocking.

durban avatar Jun 14 '23 23:06 durban