eio Fix getrandom(2)

Fix some mistakes and bad practices:

The Uring backend getrandom(2) was broken in commit 6f6b2eee, there is no guarantee getrandom(2) will fill the whole buffer in one go.
Make sure we either get all the requested bytes or fail hard. I've checked libuv and it already does the right thing (tm).
Fail as hard and as soon possible, don't let users think of catching an exception, don't propagate anything. If we can't get entropy, nothing should run. This is one of the only cases where a library should kill a program.
Don't truncate a buffer in Luv if it would overflow, fail hard (this could be improved to get in chunks).
While here, sanitize the includes.

I didn't know much about getrandom(2), and I'm surprised it was designed allowing short counts even though they based it off getentropy(2), which always does the right thing.

""" The behavior when a call to getrandom() that is blocked while reading from the urandom source is interrupted by a signal handler depends on the initialization state of the entropy buffer and on the request size, buflen. If the entropy is not yet initialized, then the call fails with the EINTR error. If the entropy pool has been initialized and the request size is large (buflen > 256), the call either succeeds, returning a partially filled buffer, or fails with the error EINTR. If the entropy pool has been initialized and the quest size is small (buflen <= 256), then getrandom() will not fail with EINTR. Instead, it will return all of the bytes that have been requested. """

Oct 11 '22 22:10 haesbaert

I don't see why the old behaviour was wrong. read_into is always allowed to do short reads; retrying is handled by e.g. read_exact.

Then the use should only be able to call read_exact and nothing else. Exporting an API under the name of secure which can return a short-count on randomized a buffer is a security hazard. Users will shoot themselves in the foot because they won't always check the return and make sure the whole buffer is filled. No one is ever interested in a partial random "secure" buffer.

I also disagree about killing the application. The application should be able to catch the exception and display it in a dialog box (or whatever), not be killed and lose the user's data.

Code will shift with time and errors will be lost, so in this case we should be super aggressive.

To get this wrong, an application would have to use read instead of read_exact, and then explicitly ignore the result of the read. Though possibly we should rename read to read_upto or read_single.

Yes, you can also call Eio_linux.Low_level.getrandom directly.

We can bet some money a not insignificant amount of people will do Flow.read secure_random buf |> ignore :smiley_cat: .

Oct 12 '22 10:10 haesbaert

Another note: It's also not a coincidence that libuv makes sure it can never return a short-count there

      int uv_random(uv_loop_t *loop, uv_random_t *req, void *buf, size_t  buflen,  unsigned
       int flags, uv_random_cb cb)
              Fill  buf  with  exactly buflen cryptographically strong random bytes acquired
              from the system CSPRNG. flags is reserved for future extension and  must  cur‐
              rently be 0.

              Short  reads  are  not possible. When less than buflen random bytes are avail‐
              able, a non-zero error value is returned or passed to the callback.

              The synchronous version may block indefinitely  when  not  enough  entropy  is
              available. The asynchronous version may not ever finish when the system is low

Oct 12 '22 10:10 haesbaert

Summarising this:

We provide a "secure-random" flow, which is a source of random bytes.
Users should take as many bytes from it as they need.
But, we don't trust users to be able to do this successfully.

We're worried that users will do something stupid, like trying to read from a flow by using Flow.read. Despite the name, this function is really exposing a low-level implementation detail of how flows work. The odoc-comment on Flow.read explains all this and points you at some alternative functions, but it's easy to believe someone would use it without reading that, and possible that they'd even explicitly ignore the result without reading it.

So, Flow.read is a very badly named and we should fix that. Any suggestions on a better name?

In addition to that, forcing the OS to fill the buffer probably won't have many bad side effects. The only ones I can think of are:

You can't show a progress indicator showing how much random data has been read.
If you are using Buf_read, it will wait until the entire buffer is filled, even if there was enough data already.
It may confuse people about how flows are supposed to work.

But I think a modern OS will just fill the buffer immediately anyway, rather than giving a few bytes and waiting, so (1) and (2) likely don't matter in this case.

Oct 14 '22 15:10 talex5

Summarising this:

We provide a "secure-random" flow, which is a source of random bytes.

Users should take as many bytes from it as they need.

But, we don't trust users to be able to do this successfully.

Agreed.

We're worried that users will do something stupid, like trying to read from a flow by using Flow.read. Despite the name, this function is really exposing a low-level implementation detail of how flows work. The odoc-comment on Flow.read explains all this and points you at some alternative functions, but it's easy to believe someone would use it without reading that, and possible that they'd even explicitly ignore the result without reading it.

Agreed.

So, Flow.read is a very badly named and we should fix that. Any suggestions on a better name?

I was thinking that maybe Flow.single_read, not Flow.read_single, it already obscures it enough ? Reality is most users will type Flow.r<wait for IDE completions>.

Also on non STREAM things, more often than not what you want is read/single_read, say a datagram socketpair (which I think you can't use recv(2) !?)

In addition to that, forcing the OS to fill the buffer probably won't have many bad side effects. The only ones I can think of are:

You can't show a progress indicator showing how much random data has been read.

If you are using Buf_read, it will wait until the entire buffer is filled, even if there was enough data already.

It may confuse people about how flows are supposed to work.

But I think a modern OS will just fill the buffer immediately anyway, rather than giving a few bytes and waiting, so (1) and (2) likely don't matter in this case.

I agree, I would just add that (1) is just a matter of asking smaller chunks.

Oct 14 '22 15:10 haesbaert

The rename of Flow.read is now merged, in #353.

The datagram problem is tracked in #342.

Oct 26 '22 10:10 talex5

Rebased and force-pushed, no changes.

Oct 26 '22 10:10 haesbaert

eio eio copied to clipboard

Fix getrandom(2)

eio
eio copied to clipboard