hyper
Use 'sendfile' when possible.
One way nginx achieves great performance when serving static files over HTTP/1.1 is by using the Unix sendfile API, which copies data from one file descriptor to another entirely in kernel space.
There's no reason we shouldn't do that ourselves when sending files over unencrypted HTTP/1.1. The kernel function was first exposed in Python's os module (as os.sendfile()) in Python 3.3, and a socket convenience wrapper (socket.sendfile()) was added in Python 3.5.
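For reference, a minimal sketch of what the Python 3.5+ wrapper looks like in use. The helper name send_file_over_socket is made up for illustration and is not part of hyper; it assumes a connected, blocking stream socket.

```python
import socket


def send_file_over_socket(sock, path):
    """Send the whole file at `path` over a connected, blocking stream socket.

    Hypothetical helper for illustration only. socket.sendfile() uses
    os.sendfile() where the platform provides it and falls back to plain
    send() calls otherwise. Returns the number of bytes sent.
    """
    with open(path, "rb") as f:  # the file must be opened in binary mode
        return sock.sendfile(f)
```

Note that socket.sendfile() refuses non-blocking sockets, so this only covers the simple blocking case.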
We should be able to, at the very least, backport that change to Python 3.3 and support it on all supported Python 3 versions. It would also be interesting to see what work is required to expose it in Python 2.7, at least in an optional way.
Obviously, this won't help on Windows: sorry, Windows!
Note that sendfile means no SSL/TLS, while HTTP/2 requires it (right?). The proposed idea can still be useful within a data center, though.
HTTP/2 could not have used sendfile anyway, given that we need to split files up into multiple frames. Agreed that I should be clearer on when sendfile can be used:
- On Unix platforms
- On Python 3.3 or higher
- When using plaintext HTTP/1.1
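Those three conditions could be checked with something like the sketch below. The function name can_use_sendfile and the http_version argument are assumptions made for illustration, not existing hyper API.

```python
import os
import ssl
import sys


def can_use_sendfile(sock, http_version="HTTP/1.1"):
    """Return True if the three conditions listed above all hold.

    Hypothetical helper: `http_version` is an assumed way of passing the
    negotiated protocol around, not something hyper exposes under this name.
    """
    return (
        hasattr(os, "sendfile")                  # Unix platforms only
        and sys.version_info >= (3, 3)           # os.sendfile() exists from 3.3
        and not isinstance(sock, ssl.SSLSocket)  # plaintext only
        and http_version == "HTTP/1.1"           # not HTTP/2
    )
```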
Technically you could sendfile() segments of a file in HTTP/2 as well, but you'd only get a performance improvement if the segments are large enough.
Finally, is this a blocking socket? If so, you can't have timeouts, etc.
And if it's a non-blocking socket, you are limited by the TCP buffer size (a few megs IIRC), so you'd have to call sendfile() many times to complete the operation.
Moreover, select()/poll() will report the socket as writable as soon as a TCP window is acknowledged, so your select+sendfile loop will only write one blob at a time, where a blob is somewhere between one segment and the receive window (rwin) in size.
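To make that concrete, a select+sendfile loop on a non-blocking socket might look roughly like this. It is only a sketch: sendfile_nonblocking and its timeout parameter are invented for illustration, and it assumes a platform where os.sendfile() accepts (out_fd, in_fd, offset, count). It also counts the calls, which is the number worth looking at.

```python
import os
import select


def sendfile_nonblocking(sock, f, timeout=5.0):
    """Push all of file `f` through non-blocking socket `sock`.

    Hypothetical helper for illustration. Returns (bytes_sent, calls),
    where `calls` is how many sendfile() calls actually moved data.
    """
    out_fd = sock.fileno()
    in_fd = f.fileno()
    remaining = os.fstat(in_fd).st_size
    offset = 0
    calls = 0
    while remaining > 0:
        # Wait until the kernel reports there is room in the send buffer.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("socket not writable within timeout")
        try:
            # Each call moves at most as much as the buffer can take.
            sent = os.sendfile(out_fd, in_fd, offset, remaining)
        except BlockingIOError:
            continue  # buffer filled up again in the meantime; wait once more
        if sent == 0:
            break  # unexpected end of file
        calls += 1
        offset += sent
        remaining -= sent
    return offset, calls
```

If calls comes out large relative to the file size, each sendfile() is only moving a small blob and the per-call overhead may eat much of the benefit.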
So... I guess what I'm trying to say is, make a simple script and test the underlying performance improvement assumption first.
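Something along these lines would do as a first test. It is a rough sketch, not a proper benchmark: it uses a local socketpair rather than a real TCP connection, the names compare, _timed and _drain are made up, and the default sizes are arbitrary.

```python
import os
import socket
import tempfile
import threading
import time


def _drain(sock, total):
    """Read and discard `total` bytes from `sock`."""
    seen = 0
    while seen < total:
        chunk = sock.recv(1 << 16)
        if not chunk:
            break
        seen += len(chunk)


def _timed(send, size):
    """Time `send(sock)` while a background thread drains the other end."""
    a, b = socket.socketpair()
    t = threading.Thread(target=_drain, args=(b, size))
    t.start()
    start = time.perf_counter()
    send(a)
    a.close()
    t.join()
    elapsed = time.perf_counter() - start
    b.close()
    return elapsed


def compare(size=8 * 1024 * 1024, chunk=1 << 16):
    """Return wall-clock seconds for read+sendall versus socket.sendfile()."""
    with tempfile.TemporaryFile() as f:
        f.write(os.urandom(size))
        f.flush()

        def with_send(sock):
            # Userspace path: read into Python, then write to the socket.
            f.seek(0)
            while True:
                data = f.read(chunk)
                if not data:
                    break
                sock.sendall(data)

        def with_sendfile(sock):
            # Kernel path where available (falls back to send() otherwise).
            f.seek(0)
            sock.sendfile(f)

        return {
            "read+sendall": _timed(with_send, size),
            "sendfile": _timed(with_sendfile, size),
        }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("%-12s %.4f s" % (name, seconds))
```

Numbers from a local socketpair will not match a real network path, so this is only a sanity check of the assumption, ideally repeated over an actual TCP connection with realistic file sizes.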