QUIC Event Loop Design

Open mattcaswell opened this issue 3 years ago • 28 comments

Early stage design document identifying some possible approaches to how the QUIC event loop might work for input into OTC discussions.

mattcaswell avatar Dec 03 '21 11:12 mattcaswell

Hint: to read the document and see the images inline, from the "Files changed" tab above click the 3 dots "..." in the top right-hand corner of the quic-event-loop.md document and select "View file" from the menu.

mattcaswell avatar Dec 03 '21 11:12 mattcaswell

Interesting document; however, I see some big gaps in the dissertation here.

You should really start from the design of current applications and think about how OpenSSL would be used by them; otherwise you may create something that will not work, or that will be extremely hard to integrate. Once you list how a bunch of applications currently using OpenSSL deal with their event loop, and how you see integration with your new API, you will have a more productive ground for discussion and evaluation. (I suggest at the very least a couple of HTTP servers, one or two HTTP clients like curl, and maybe something like Samba, given that Microsoft is implementing QUIC for SMB, so SMB servers may well end up needing to use QUIC for compatibility or performance, or both.)

Specifically I see that most options hardly work with single process async applications, and there are many like that, as async programming has been quite a fad in some circles for a while. For those applications the only option, generally, is that they have their own event loop that must be the one listening on sockets.
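As a minimal sketch of the shape described here, the following assumes the application owns the loop and the sockets, and the protocol engine is only ever handed datagrams. `EchoEngine` and `handle_datagram` are hypothetical names invented for illustration and are not part of any OpenSSL API; Python's `selectors` module stands in for whatever readiness mechanism the application already uses.

```python
import selectors
import socket


class EchoEngine:
    """Hypothetical stand-in for a protocol engine that performs no I/O itself."""

    def handle_datagram(self, data, addr):
        # A real QUIC engine would process the packet here. This stand-in
        # just asks the caller to send the payload back, upper-cased.
        return [(data.upper(), addr)]


def run_once(sel, timeout=1.0):
    """One iteration of an application-owned loop: the app waits, reads and writes."""
    handled = 0
    for key, _events in sel.select(timeout):
        sock, engine = key.fileobj, key.data
        data, addr = sock.recvfrom(65535)
        # The engine only returns what should be sent; the app does the sending.
        for out, dest in engine.handle_datagram(data, addr):
            sock.sendto(out, dest)
        handled += 1
    return handled


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.setblocking(False)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    sel = selectors.DefaultSelector()
    sel.register(server, selectors.EVENT_READ, EchoEngine())
    try:
        client.sendto(b"ping", server.getsockname())
        handled = run_once(sel)
        reply, _ = client.recvfrom(65535)
        return handled, reply
    finally:
        sel.close()
        server.close()
        client.close()
```

The point of the sketch is only the division of labour: the library never blocks and never sees a file descriptor, so it can sit inside any existing loop.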

Another extremely important consideration that I see missing is any discussion about performance. One of the key reasons for using QUIC is performance; if performance is not important there is really not much point in using QUIC, and multiple TLS connections, or a single one, will be preferable. This again means the application must have as much control as possible, and the overhead introduced by the OpenSSL layer really needs to be minimal and under application control. If that is not the case the OpenSSL implementation will not be useful.

Finally, and this is just a nit: nobody in their right mind uses select() these days, so I suggest using another name. Even poll() is considered truly obsolete on Linux, although that may be the only option on some other architectures :-)

simo5 avatar Dec 03 '21 14:12 simo5

You should really start from the design of current applications and think about how OpenSSL would be used by them; otherwise you may create something that will not work, or that will be extremely hard to integrate. Once you list how a bunch of applications currently using OpenSSL deal with their event loop, and how you see integration with your new API, you will have a more productive ground for discussion and evaluation. (I suggest at the very least a couple of HTTP servers, one or two HTTP clients like curl, and maybe something like Samba, given that Microsoft is implementing QUIC for SMB, so SMB servers may well end up needing to use QUIC for compatibility or performance, or both.)

Good suggestion. I'll give this some thought. We have actually done some work to identify the different kinds of applications that we want to support. Applying that to what it means for the event loop might be useful.

Specifically I see that most options hardly work with single process async applications, and there are many like that, as async programming has been quite a fad in some circles for a while. For those applications the only option, generally, is that they have their own event loop that must be the one listening on sockets.

It seems option 1 would be the way to go for these applications. I can capture this somewhere in the text.

This again means the application must have as much control as possible, and the overhead introduced by the OpenSSL layer really needs to be minimal and under application control. If that is not the case the OpenSSL implementation will not be useful.

Again - I can capture this in the document.

mattcaswell avatar Dec 03 '21 14:12 mattcaswell

Is it possible to give the images a name other than just an image?

kroeckx avatar Dec 03 '21 18:12 kroeckx

Is it possible to give the images a name other than just an image?

Any particular reason why? Those images aren't intended to be viewed independently of the document as such.

You should be able to review looking at the marked up version - i.e. https://github.com/openssl/openssl/blob/5de8dfab5037f23283150b593076f47d2b2e2d84/doc/designs/quic-design/quic-event-loop.md which is how things are intended to be looked at.

t-j-h avatar Dec 07 '21 05:12 t-j-h

The “application selects” API is the most important one. Event-loop implementations are highly system-specific and performance-critical, and OpenSSL does not want to be implementing them.

DemiMarie avatar Dec 08 '21 22:12 DemiMarie

The “application selects” API is the most important one. Event-loop implementations are highly system-specific and performance-critical, and OpenSSL does not want to be implementing them.

I agree. This seems to be the strong message coming through from most people that look at this.

mattcaswell avatar Dec 09 '21 11:12 mattcaswell

The “application selects” API is the most important one. Event-loop implementations are highly system-specific and performance-critical, and OpenSSL does not want to be implementing them.

I agree. This seems to be the strong message coming through from most people that look at this.

Unless I am mistaken, what most people (myself included) want is for #8797 to be merged. I honestly do not understand why the Committee decided to not do that.

DemiMarie avatar Dec 09 '21 11:12 DemiMarie

Unless I am mistaken, what most people (myself included) want is for #8797 to be merged. I honestly do not understand why the Committee decided to not do that.

@DemiMarie refer to this blog post for answers to that question. Comments for this PR should be focused on the details in the design itself in the PR.

t-j-h avatar Dec 09 '21 11:12 t-j-h

Any sort of event loop that OpenSSL tried to support would need to cover a huge variety of platforms. Linux epoll and io_uring, Windows IOCP and IoRing, and *BSD kqueue are just some of what would need to be supported. This doesn’t even consider those who are using kernel-bypass networking. There is absolutely no way that OpenSSL can try to abstract over all of these, nor should it even attempt to.

The two QUIC implementations I am aware of that include their own I/O are MsQuic and quinn. Neither is suitable for integrating into something like libp2p or curl. In fact, quinn is just a wrapper around the lower-level quinn-proto crate, which contains the actual state machine, and quinn-proto is suitable for use in libp2p.

If OpenSSL is going to provide a QUIC implementation, it should be as bare-bones as possible, and compatible with as many uses as possible. That means that OpenSSL should just provide a raw state machine that does not own any buffers, does no I/O, and does not even try to dispatch packets based on their connection ID. Cloudflare’s quiche is one example of such an API from what I can tell.
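To illustrate what "a raw state machine that does no I/O" could look like, here is a toy sketch. `ToyMachine`, `write_packet` and `read_packet` are hypothetical names, and the two-message exchange is invented purely for illustration; it is not QUIC and not the API of quiche or quinn-proto.

```python
import enum


class State(enum.Enum):
    IDLE = 1
    WAIT_REPLY = 2
    DONE = 3


class ToyMachine:
    """Toy protocol state machine: it never touches a socket."""

    def __init__(self):
        self.state = State.IDLE

    def write_packet(self, out):
        """Write the next outgoing packet into the caller-owned buffer `out`.

        Returns the number of bytes written (0 when there is nothing to send).
        """
        if self.state is State.IDLE:
            payload = b"HELLO"
            out[:len(payload)] = payload
            self.state = State.WAIT_REPLY
            return len(payload)
        return 0

    def read_packet(self, packet):
        """Consume one incoming packet supplied by the caller."""
        if self.state is State.WAIT_REPLY and bytes(packet) == b"WELCOME":
            self.state = State.DONE


def demo():
    machine = ToyMachine()
    buf = bytearray(1500)                      # owned by the application
    n = machine.write_packet(memoryview(buf))
    sent = bytes(buf[:n])                      # the app hands this to its own I/O
    machine.read_packet(memoryview(b"WELCOME"))
    return sent, machine.state
```

Because the machine only transforms inputs into outputs, the same object could be driven from epoll, io_uring, IOCP or a test harness without change.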

DemiMarie avatar Dec 09 '21 11:12 DemiMarie

Any sort of event loop that OpenSSL tried to support would need to cover a huge variety of platforms. Linux epoll and io_uring, Windows IOCP and IoRing, and *BSD kqueue are just some of what would need to be supported. This doesn’t even consider those who are using kernel-bypass networking. There is absolutely no way that OpenSSL can try to abstract over all of these, nor should it even attempt to.

I agree.

does not own any buffers

I'd like to understand what motivates this statement. Possibly this is related to wanting to use features such as "sendmmsg" or GSO or similar, or possibly just a desire to reduce copying within OpenSSL (e.g. "zero copy"). Or something else?

does no I/O

I assume what is actually important here is the ability to override any default I/O behaviour, e.g. in the same way as OpenSSL currently has a BIO abstraction; something similar will be needed. There can be default out-of-the-box I/O implementations, but applications need to be able to override them if they want to.

does not even try to dispatch packets based on their connection ID

Why? Is it acceptable to have default dispatching capability that can be overridden if needed?

mattcaswell avatar Dec 09 '21 12:12 mattcaswell

does not own any buffers

I'd like to understand what motivates this statement. Possibly this is related to wanting to use features such as "sendmmsg" or GSO or similar, or possibly just a desire to reduce copying within OpenSSL (e.g. "zero copy"). Or something else?

It’s all of these, plus others. For instance, packet buffers might need to be allocated out of DMA-capable (pinned) memory, or from a buffer registered with io_uring. If one also desires to perform in-place encryption/decryption, OpenSSL cannot be responsible for performing any memory allocation in the data path.
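A rough sketch of that constraint, under stated assumptions: `BufferPool` is a hypothetical application-owned pool (a real one might hand out pinned or io_uring-registered memory), and XOR stands in for packet protection only to keep the example short. The library-side function works strictly inside the buffer it is given.

```python
class BufferPool:
    """Hypothetical application-owned pool of fixed-size packet buffers."""

    def __init__(self, count, size):
        self._backing = bytearray(count * size)   # one allocation, made up front
        view = memoryview(self._backing)
        self._free = [view[i * size:(i + 1) * size] for i in range(count)]

    def acquire(self):
        return self._free.pop()

    def release(self, buf):
        self._free.append(buf)


def unprotect_in_place(buf, length, key):
    """Stand-in for packet decryption: transforms buf[:length] in place.

    XOR is used only for brevity; it is not real packet protection.
    """
    for i in range(length):
        buf[i] ^= key
    return buf[:length]   # a view of the caller's buffer, not a copy


def demo():
    pool = BufferPool(count=4, size=1500)
    buf = pool.acquire()
    wire = bytes(b ^ 0x5A for b in b"payload")
    buf[:len(wire)] = wire            # in practice recv_into() or the NIC fills this
    plain = unprotect_in_place(buf, len(wire), 0x5A)
    same_memory = plain.obj is buf.obj
    result = bytes(plain)
    pool.release(buf)
    return result, same_memory
```

The returned view aliases the pool's backing store, which is what lets the application decide where packet memory lives.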

does no I/O

I assume what is actually important here is the ability to override any default I/O behaviour, e.g. in the same way as OpenSSL currently has a BIO abstraction; something similar will be needed. There can be default out-of-the-box I/O implementations, but applications need to be able to override them if they want to.

The BIO abstraction is not suitable here, as BIO APIs operate on memory not owned by the BIO. This forces the BIO to make a copy internally. See the discussion about AsyncRead (slower) vs AsyncBufRead (faster) in the Rust community for background.
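The difference between the two API styles can be sketched as follows. Both classes are hypothetical illustrations; `fill_buf` and `consume` are named after the corresponding Rust `BufRead` methods, and nothing here is an OpenSSL interface.

```python
class CopyingReader:
    """Read-style API: the caller supplies the destination, so data is copied into it."""

    def __init__(self, data):
        self._data = bytearray(data)
        self._pos = 0

    def read_into(self, dest):
        n = min(len(dest), len(self._data) - self._pos)
        dest[:n] = self._data[self._pos:self._pos + n]   # copy into caller memory
        self._pos += n
        return n


class BorrowingReader:
    """BufRead-style API: the reader lends out a view of its own buffer."""

    def __init__(self, data):
        self._data = bytearray(data)
        self._pos = 0

    def fill_buf(self):
        return memoryview(self._data)[self._pos:]        # no copy

    def consume(self, n):
        self._pos += n


def demo():
    dest = bytearray(4)
    copied = CopyingReader(b"quic-data").read_into(dest)
    reader = BorrowingReader(b"quic-data")
    view = reader.fill_buf()
    first = bytes(view[:4])
    reader.consume(4)
    rest = bytes(reader.fill_buf())
    return copied, bytes(dest), first, rest
```

In the borrowing style the consumer inspects the data where it already sits and only reports how much it used, so the extra copy disappears.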

does not even try to dispatch packets based on their connection ID

Why? Is it acceptable to have default dispatching capability that can be overridden if needed?

Load-balancing in shared-nothing designs requires that packets be dispatched by the user. The NIC uses RSS to steer traffic based on the 5-tuple, but some packets might be steered incorrectly because the client’s IP address has changed. Such traffic needs to be explicitly forwarded to the appropriate thread.
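A sketch of user-side dispatch, with loud assumptions: the header parsing follows the RFC 9000 layout (long headers carry an explicit Destination Connection ID length, short headers do not), but encoding the owning worker in the first connection ID byte is just one hypothetical scheme chosen for illustration, and `dispatch` and the per-worker `queues` are invented names. Client-chosen connection IDs in Initial packets would need separate handling that is not shown.

```python
def destination_cid(packet, short_cid_len):
    """Extract the Destination Connection ID from a QUIC packet header."""
    if packet[0] & 0x80:
        # Long header: flags(1), version(4), DCID length(1), DCID.
        cid_len = packet[5]
        return bytes(packet[6:6 + cid_len])
    # Short header: flags(1), DCID of a length the server chose earlier.
    return bytes(packet[1:1 + short_cid_len])


def owning_worker(cid, num_workers):
    """Assumed scheme: the server encoded the owning worker in the first CID byte."""
    return cid[0] % num_workers


def dispatch(packet, received_on, queues, short_cid_len=8):
    """Queue the packet for the worker that owns the connection.

    Returns "local" if that is the receiving worker, else "forwarded".
    """
    owner = owning_worker(destination_cid(packet, short_cid_len), len(queues))
    queues[owner].append(packet)
    return "local" if owner == received_on else "forwarded"


def demo():
    queues = [[], [], [], []]
    cid = bytes([2]) + b"\x11" * 7             # connection owned by worker 2
    packet = bytes([0x40]) + cid + b"payload"  # short-header packet
    first = dispatch(packet, received_on=2, queues=queues)
    # After the client's address changes, RSS may deliver to another worker.
    second = dispatch(packet, received_on=0, queues=queues)
    return first, second, len(queues[2])
```

Routing on the connection ID rather than the 5-tuple is what keeps the connection pinned to one worker across an address change.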

DemiMarie avatar Dec 10 '21 00:12 DemiMarie

I assume that the images in this PR are based on the text files. I would like to see build instructions of how to generate the images, preferably using a make target. I assume they are generated with free software and that that software doesn't place restrictions on the images.

kroeckx avatar Dec 11 '21 16:12 kroeckx

I assume that the images in this PR are based on the text files. I would like to see build instructions of how to generate the images, preferably using a make target. I assume they are generated with free software and that that software doesn't place restrictions on the images.

Please read https://github.com/openssl/openssl/blob/5de8dfab5037f23283150b593076f47d2b2e2d84/doc/designs/quic-design/seqdiags/README.md for how the images are generated. Unfortunately there is no free software tool to generate the images. As for restrictions on the images, quoting the site:

Are diagrams/scripts created using SequenceDiagram.org subject to any license?

No license is imposed by SequenceDiagram.org on the generated output. However, like with all images containing text, the fonts used might. The default font used in diagrams is the default sans-serif font selected by your browser. You can specify a different font using the fontfamily keyword, see help for more information. See LICENSE for details.

t8m avatar Dec 13 '21 10:12 t8m

I'm not sure if this is the correct issue to raise it in, but I don't see any requirements for how to deal with the UDP sockets themselves. This includes questions such as: are we responsible for sending the data, or is the application? Do we have one socket for all connections, or one per connection? Do we need to support binding to an interface, or should the application do that for us? Do we have a way to get the connection details? If at some point multipath support is added, will the API support it?

kroeckx avatar Dec 24 '21 15:12 kroeckx

This PR is in a state where it requires action by @openssl/committers but the last update was 30 days ago

openssl-machine avatar Jan 24 '22 00:01 openssl-machine

I honestly recommend that OpenSSL use an external library for QUIC support. Cloudflare’s quiche is my preferred choice, as it is written in Rust and thus (hopefully) memory-safe.

DemiMarie avatar Jan 24 '22 00:01 DemiMarie

I would recommend that, if a decision can't be made, the QUIC fork be merged and any changes/updates made to it.

breisig avatar Feb 23 '22 19:02 breisig

select or poll was very good for people in the last century ^_^. Is this a cross-platform design that considers kqueue, IOCP, epoll and ...

crasyangel avatar Feb 28 '22 07:02 crasyangel

@openssl/contractors second review?

paulidale avatar Jul 29 '22 02:07 paulidale

I still think that by far the best solution is to rely on a third-party library for QUIC.

DemiMarie avatar Nov 02 '22 01:11 DemiMarie

Any updates for merging?

HLFH avatar Dec 15 '22 18:12 HLFH