pyopenssl
pyopenssl copied to clipboard
High CPU and bandwidth usage with small send()s in OpenSSL.SSL.Connection
Consider the following program:
import sys
from twisted.internet.protocol import Protocol
from twisted.internet.ssl import CertificateOptions, PrivateCertificate
from twisted.protocols.loopback import loopbackTCP
from twisted.internet.task import react
from twisted.internet import reactor
class Client(Protocol):
def connectionMade(self):
self.count = 1000
self.transport.startTLS(CertificateOptions(verify=False, requireCertificate=False))
assert sys.argv[1] in ("small", "big")
self.next_iteration()
def next_iteration(self):
self.count -= 1
if sys.argv[1] == "small":
for i in range(1_000):
self.transport.write(b"X")
else:
self.transport.write(b"X" * 1000)
if self.count == 0:
self.transport.loseConnection()
else:
reactor.callLater(0, self.next_iteration)
class Server(Protocol):
received = 0
def connectionMade(self):
with open("server.pem", "rb") as f:
options = PrivateCertificate.loadPEM(f.read()).options()
self.transport.startTLS(options)
def dataReceived(self, data):
self.received += len(data)
def connectionLost(self, reason):
assert self.received == 1_000_000, self.received
react(lambda reactor: loopbackTCP(Server(), Client(), noisy=True))
You can run it with lots of small writes, or with bigger writes. Here's the CPU usage:
$ time python tls_benchmark.py big
real 0m0.248s
user 0m0.144s
sys 0m0.039s
$ time python tls_benchmark.py small
real 0m3.312s
user 0m3.217s
sys 0m0.038s
I also measured bandwidth usage (bytes received in the TCP layer of Twisted) of the two options:
big: RECEIVED TOTAL OF 1022417 bytes
small: RECEIVED TOTAL OF 23000417 bytes
Discussion
Connection.send() returns how many bytes were successfully written. And pyOpenSSL sets SSL_MODE_ENABLE_PARTIAL_WRITE. This means future send() should not be called with same bytes, but rather those past whatever the result of send() is. Initially I thought this was part of the issue, but disabling it doesn't really make a difference.
So to solve this, either pyOpenSSL has to grow a buffering layer, or this would have to be something handled in libraries using pyOpenSSL. If the latter, the performance and bandwidth implications of small send()/write() should be documented, at least.
Oh, and I should add that profiling indicates the bulk of the additional time is inside OpenSSL code, it's not about Python function overhead.
I fixed this in Twisted with a buffering layer, but it may well be an issue in other frameworks or applications.