ocaml-dns Add eio backend for ocaml-dns client

This PR adds an eio backend for the ocaml-dns client. At the moment it only supports the TCP transport protocol. I intend to add TLS (DNS over TLS) support in the future, once eio support for ocaml-tls is done.

[x] Transport module
- use Eio.Mutex around Queue calls ? (Not sure if it is needed yet)
[x] Integrate mirage-crypto-rng-eio (https://github.com/mirage/mirage-crypto/pull/155)
[x] ohost executable exercising dns-client-eio package (https://github.com/bikallem/ocaml-dns/blob/eio/eio/client/ohost.ml)

This PR depends on a few unreleased packages:

[x] A new version of eio with the Mutex module (https://github.com/ocaml-multicore/eio/blob/main/lib_eio/eio.mli#L35)
[x] mirage-crypto-rng-eio (https://github.com/mirage/mirage-crypto/pull/155)

At the moment eio doesn't support a monotonic clock, so we still use mtime. It may be preferable to switch to eio's monotonic clock once support for it lands (https://github.com/ocaml-multicore/eio/issues/229). However, that should be a future PR.

Jun 23 '22 10:06 bikallem

I have rebased/updated the PR to address reviewer feedback.

(Was the idea of the first class module to force users to initialise mirage-crypto before using the library? If so, we could probably fix this more easily by having Mirage_crypto_rng_eio.run return a token and passing that to create_from_env. Passing first-class modules around tends to get awkward.)

Indeed, the idea of the API was to maintain the invariant that consumers of the library can use the module only when running under Mirage_crypto_rng_eio.run. The current API aims to maintain that invariant while remaining convenient, so that one doesn't have to run the mirage-crypto RNG manually (although they can if they so desire).

Oct 04 '22 14:10 bikallem

Hmmm ... not sure why the ci is failing. Can anyone help?

Oct 04 '22 14:10 bikallem

It says:

lru.0.3.0: Requires ocaml >= 4.03.0 & < 5.0

Oct 04 '22 15:10 talex5

I'm not particularly a fan of passing first-class modules around, and don't think there's a good reason here.

I wonder what the "EIO" and "RANDOM" story is (if there's any):

* I have seen some work on mirage-crypto-rng-eio, so you can initialize and feed the Fortuna RNG;
* There has been as well some support for getrandom (but seeing the discussion of https://github.com/ocaml-multicore/eio/pull/344 I'm not sure whether this can be safely used) -- this is exposed as "secure_random" by the "StdEnv" AFAICT (though I'm not sure what else is in this "StdEnv" and why);
* And OCaml in 5.0 has support for getentropy to feed its internal RNG (Random module).

So, what is the plan for EIO and RANDOM? How and who should decide where to take random numbers from? Is this all up to the user (and do they have a unified interface)? In DNS, we don't need cryptographically secure random, something that is not predictable is sufficient (it's used mainly for the ID anyways, to avoid spoofing of DNS packets (see https://en.wikipedia.org/wiki/DNS_spoofing if interested in details)).

Oct 25 '22 20:10 hannesm

I'm not particularly a fan of passing first-class modules around, and don't think there's a good reason here.

Indeed. I removed it in my branch (see https://github.com/mirage/ocaml-dns/pull/312#pullrequestreview-1088918946).

(but seeing the discussion of https://github.com/ocaml-multicore/eio/pull/344 I'm not sure whether this can be safely used)

The PR is just trying to make it harder to misuse the API. mirage-crypto-rng-eio already uses it correctly (I'm not aware of anyone using it incorrectly).

In DNS, we don't need cryptographically secure random, something that is not predictable is sufficient

I assume it's just copying the existing lwt code, which uses mirage-crypto-rng.lwt. Would be good to reduce the dependencies if it's not needed :-)

Oct 27 '22 12:10 talex5

Trying once more. In eio, do you have a user guide / approach to randomness?

I can see three different things going on here:

* the OCaml 5 runtime (Stdlib.Random -- which is a LXM: Better Splittable Pseudorandom Number Generators (and Almost as Fast); seeding is done via getentropy() if available)
* the eio providing a "secure_random" in "StdEnv", which calls out to getrandom()
* mirage-crypto-rng-eio calling out to rdrand/rdseed and getrandom/getentropy (periodic task) feeding a Fortuna RNG

Now, my questions are:

* is there a unified interface between these implementations?
* is there an eio-recommended usage of either of these?
* is the approach to have one syscall (getentropy/getrandom) each time some random is needed too expensive?

Maybe this PR here is the wrong context for these questions and considerations, but for me it is important to understand how "eio" should be used in order to avoid having to back out such changes.

Oct 27 '22 14:10 hannesm

I have now removed the use of first-class modules. The ohost.exe included in the PR doesn't seem to be working after the latest rebase and commit, so I am debugging the issue.

EDIT: Updating the default nameservers to the same as unix client did the trick. It is working now.

Oct 27 '22 14:10 bikallem

Trying once more. In eio, do you have a user guide / approach to randomness?

I can see three different things going on here:
* the OCaml 5 runtime (Stdlib.Random -- which is a LXM: Better Splittable Pseudorandom Number Generators (and Almost as Fast); seeding is done via `getentropy()` if available)
* the eio providing a "secure_random" in "StdEnv", which calls out to `getrandom()`
* mirage-crypto-rng-eio calling out to rdrand/rdseed and getrandom/getentropy (periodic task) feeding a Fortuna RNG

Now, my questions are:

* is there a unified interface between these implementations?
* is there an eio-recommended usage of either of these?
* is the approach to have one syscall (getentropy/getrandom) each time some random is needed too expensive?

Maybe this PR here is the wrong context for these questions and considerations, but for me it is important to understand how "eio" should be used in order to avoid having to back out such changes.

All valid and good questions, for which I do not have quick answers. Perhaps they would get better traction as an issue in the eio repo?

Oct 27 '22 14:10 bikallem

Trying once more. In eio, do you have a user guide / approach to randomness?

No. At the moment, Eio is just exposing what the OS provides: e.g. Linux provides getrandom(2) and Eio provides an OCaml API for using that.

[ mirage-crypto, Eio's secure_random, stdlib random ]

is there a unified interface between these implementations?

Not at the moment. Eio's secure_random is just an Eio.Flow.source (byte stream). It would be possible to access the other two with the same interface if desired, though.

is there an eio-recommended usage of either of these?

I was imagining that:

* If you want secure random numbers, you use mirage-crypto (which uses Eio to call getrandom as needed).
* If you don't, you use Stdlib.Random (possibly seeded using Eio, but probably using OCaml's built-in seeding).
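
For illustration, the non-secure option could look roughly like the sketch below. It uses only Stdlib.Random; `fresh_query_id` is a hypothetical helper name, not something from ocaml-dns or this PR, and the sketch assumes that an unpredictable (not cryptographically secure) 16-bit ID is acceptable.

```ocaml
(* Hypothetical helper, not part of ocaml-dns: draw a 16-bit DNS query ID
   from Stdlib.Random. *)

(* Seed the default generator from a system-dependent source, once at
   startup. *)
let () = Random.self_init ()

(* DNS header IDs are 16 bits wide, so valid values are 0 .. 0xFFFF. *)
let fresh_query_id () = Random.int 0x10000

let () =
  let id = fresh_query_id () in
  Printf.printf "query id: 0x%04x\n" id
```

Swapping in a secure source later would only mean replacing the body of `fresh_query_id`.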

is the approach to have one syscall (getentropy/getrandom) each time some random is needed too expensive?

I was expecting secure_random to be used only for seeding, but depending on the requirements I guess it could be fine to use it directly.

The API docs should probably mention that it just wraps the OS syscall and may not be particularly fast. Or perhaps users could wrap it with Eio.Buf_read to read larger amounts of random data in chunks.
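
As a rough sketch of the "read in chunks" idea, using only the standard library rather than Eio.Buf_read: the pool below refills a buffer in one go and hands out 16-bit values until it runs dry. The `pool` type, `create`, `next_uint16` and the `fill` callback are all hypothetical names for this example; `fill` stands in for whatever byte source is used (for instance a wrapper around an OS call).

```ocaml
(* Hypothetical sketch: amortise expensive entropy reads by refilling a
   buffer in large chunks and handing out small slices from it.
   [fill] stands in for any byte source; it must overwrite the whole
   buffer it is given. *)

type pool = {
  fill : bytes -> unit;  (* refills the buffer with fresh random bytes *)
  buf : bytes;           (* cached random bytes *)
  mutable pos : int;     (* index of the next unused byte *)
}

let create ?(size = 256) fill =
  if size < 2 then invalid_arg "create: size must be at least 2";
  (* Start "empty" so that the first request triggers a refill. *)
  { fill; buf = Bytes.create size; pos = size }

(* Return a 16-bit value, refilling the pool only when it runs dry. *)
let next_uint16 p =
  if p.pos + 2 > Bytes.length p.buf then begin
    p.fill p.buf;
    p.pos <- 0
  end;
  let v = Bytes.get_uint16_be p.buf p.pos in
  p.pos <- p.pos + 2;
  v

let () =
  (* Demo source backed by Stdlib.Random; a real program would plug in
     an OS-backed source here instead. *)
  Random.self_init ();
  let fill b =
    for i = 0 to Bytes.length b - 1 do
      Bytes.set_uint8 b i (Random.int 256)
    done
  in
  let pool = create fill in
  Printf.printf "id: 0x%04x\n" (next_uint16 pool)
```

With the default size of 256 bytes, one refill serves 128 IDs. The pool is not synchronised, so this sketch assumes it is used from a single domain.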

You know a lot more about RNGs than me - how would you like it to work?

Oct 27 '22 17:10 talex5

I feel the two main points that Hannes is concerned about are: a) randomness, and b) the queue that tries to work around the lack of happy-meatballs

I think a is a non-issue: as @hannesm said, we just want random 16-bit ids that do not need to be cryptographically secure. In EIO, as @talex5 mentioned, we provide a secure_random from which you request cryptographically secure entropy, so this port could use secure_random to retrieve "more often than not" unique 16-bit ids. The other alternative is to call the stdlib Random, which for this case would be enough. I agree with @talex5: if you deem that retrieving that entropy is too expensive, it's on you to pull larger chunks and deplete them as you need. EIO is not keeping state, we're just tapping into whatever the OS thinks is "good entropy". Personally I'd use it directly until I see 1 brazilion packets per second being hindered by it. Worth mentioning that getentropy/getrandom guarantee not to block on <= 256 byte requests (it's actually 4k for getrandom, the manpage is lying).

Maybe we should attack b first, so that this port doesn't look too special. In regards to functionality, the EIO port should "look a lot" like the Lwt one in terms of structure and primitives, and perhaps if we work on happy-balls first this becomes a bit more of a 1:1 port. I only had a quick look at the Lwt code, so forgive me for any misunderstandings.

Oct 31 '22 16:10 haesbaert

@talex5 asked

You know a lot more about RNGs than me - how would you like it to work?

Well, cryptographically secure random numbers. If it is too expensive to call getentropy (or getrandom) for each, there should be an RNG.

The issue with RNG is e.g. fork() (which means that the RNG internal state is the same between parent and child process). Is there an issue with multicore (i.e. is there a single RNG for all domains, or one RNG per domain)?

At the moment, I think that mirage-crypto-rng-{lwt,async,eio,unix} are overengineered (of course depends on your use case, and there's quite some history connected to these implementations). Instead, less trouble would be to skip the Fortuna and always use getentropy. For MirageOS this is surely an issue, since there we need to collect entropy. Of course this should be benchmarked, and e.g. an application that generates big RSA keys all the time will be worse off than e.g. a https server.

Also, certainly there's an argument for trying to share lots of code between contexts since rarely used code has bugs that are hard to find. The path forward would be to figure out what common applications are and which RNG is fine in such a setting. Could EIO, already having a concept of "secure_random" in its "environment", have that pluggable (or is it already) -- i.e. some application may decide to use mirage-crypto-rng-eio as the "secure_random" (by registering that somehow), and magically there won't be many getrandom() calls?

NB this does not work in lwt/async since they don't have such an environment. NB the OCaml 4 Random was not seeded as the OCaml 5 one, and the implementation is also different (but takes a lot of effort to split and contain different internal states if split).

Nov 09 '22 17:11 hannesm

The issue with RNG is e.g. fork() (which means that the RNG internal state is the same between parent and child process). Is there an issue with multicore (i.e. is there a single RNG for all domains, or one RNG per domain)?

random.mli says "With multiple domains, each domain has its own generator that evolves independently of the generators of other domains. When a domain is created, its generator is initialized by splitting the state of the generator associated with the parent domain."
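
A small sketch of what that means in practice, assuming OCaml 5 (for the Domain module). `draw` is a hypothetical helper for this example. Each spawned domain draws from its own generator, so no locking around Random is needed:

```ocaml
(* Requires OCaml 5. Each spawned domain uses its own generator, which is
   split from the parent domain's generator state when the domain is
   created. *)
let draw () = Random.int 0x10000

let () =
  Random.self_init ();
  let d1 = Domain.spawn draw in
  let d2 = Domain.spawn draw in
  let a = Domain.join d1 and b = Domain.join d2 in
  Printf.printf "domain 1: 0x%04x, domain 2: 0x%04x\n" a b
```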

Instead, less trouble would be to skip the Fortuna and always use getentropy

OK, that makes sense.

Could EIO, already having a concept of "secure_random" in its "environment", have that pluggable (or is it already) -- i.e. some application may decide to use mirage-crypto-rng-eio as the "secure_random" (by registering that somehow), and magically there won't be many getrandom() calls?

Yes, the interface is just a source of bytes, and you can provide any alternative implementation. For example, you would normally have:

let main ~secure_random = ...

let () =
  Eio_main.run @@ fun env ->
  let secure_random = Eio.Stdenv.secure_random env in
  main ~secure_random

I would expect that, for a unikernel, the mirage tool would generate the code that calls main, and the tool would decide which RNG to use. e.g.

let () =
  Eio_xen.run @@ fun xen ->
  let secure_random = Mirage_crypto_rng_eio.create ... in
  main ~secure_random

Alternatively, the mirage tool could create its own env, but that seems unnecessary.

Nov 15 '22 11:11 talex5

Needs https://github.com/mirleft/ocaml-tls/pull/458 to operate correctly.

Dec 12 '22 10:12 bikallem

So it seems we're a bit stuck on this PR:

(a) there's some lack of a "random" story for eio (and it seems nobody is pushing that forward (no, it won't be me))
(b) concurrent reads and writes on TLS (further discussed in https://github.com/mirleft/ocaml-tls/issues/464, still without a solution)

How to move forward here? I'm unlikely to have time before Q3 2023 to look into "eio", but I suspect you'd like to have "something merged" sooner. But I'm still overwhelmed by the complexity of eio and don't feel confident that I grasp the semantics of what is happening where (and why concurrently etc.).

Feb 15 '23 19:02 hannesm

@hannesm This PR is now ready. It doesn't have the TLS issue anymore.

(a) there's some lack of a "random" story for eio (and it seems nobody is pushing that forward (no, it won't be me))

The issue is separate from dns-client-eio and should be handled in the eio repo.

Mar 06 '23 09:03 bikallem

I took a long look at this PR, and it seems that making a DNS request from another domain is not an option with the design currently proposed here - at least, the use of "mutable" without protection makes me think there could be problems.

The reason I point to this specific usage is that it seems to me that the design of happy-eyeballs (the core package, not happy-eyeballs-lwt) is to be a background task that maintains a "pool" of TCP/IP connections to nameservers and can resolve a DNS request from the user. It is therefore "legitimate" to think that one domain would take care of such a task in the background while the user application interacts with it from another domain to resolve domain names.
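
To illustrate the kind of protection meant here, a minimal sketch (not the PR's code) of a request queue guarded by a mutex so that several domains can push to it. It assumes OCaml 5, where Mutex and Domain are in the standard library; the PR description mentions Eio.Mutex, and Stdlib.Mutex is used here only to keep the example free of dependencies. `shared_queue`, `with_lock`, `push`, `pop_opt` and `length` are hypothetical names.

```ocaml
(* Hypothetical sketch, OCaml 5 stdlib only: a queue of pending requests
   guarded by a mutex so that several domains can push to it safely. *)

type 'a shared_queue = { lock : Mutex.t; items : 'a Queue.t }

let create () = { lock = Mutex.create (); items = Queue.create () }

(* Run [f] on the underlying queue while holding the lock; the lock is
   released even if [f] raises. *)
let with_lock q f =
  Mutex.lock q.lock;
  Fun.protect ~finally:(fun () -> Mutex.unlock q.lock) (fun () -> f q.items)

let push q x = with_lock q (fun items -> Queue.push x items)
let pop_opt q = with_lock q Queue.take_opt
let length q = with_lock q Queue.length

let () =
  let q = create () in
  (* Two producer domains push 1000 request ids each. *)
  let producer base () = for i = 1 to 1000 do push q (base + i) done in
  let d1 = Domain.spawn (producer 0) in
  let d2 = Domain.spawn (producer 10_000) in
  Domain.join d1;
  Domain.join d2;
  Printf.printf "queued requests: %d\n" (length q)
```

Without the lock, concurrent `Queue.push` calls from two domains would race on the queue's internal mutable fields.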

Jun 30 '23 13:06 dinosaure

I took a long look at this PR, and it seems that making a DNS request from another domain is not an option with the design currently proposed here - at least, the use of "mutable" without protection makes me think there could be problems.

In current eio, IIUC a resource/connection/switch created in one domain can't be moved to or used in another domain. Therefore, even if happy-eyeballs is able to create connections, I am not sure those connections can be used/shared in an ad-hoc manner by the domains. However, this does not preclude having multiple/parallel DNS client requests via OCaml domains, which the current design enables.

Jul 06 '23 21:07 bikallem
