File descriptors are not explicitly closed in TLS automata (which breaks with pypy)
Brief description
We use the TLS automata from scapy to interact with TLS servers, opening a large number of connections (each with a fresh automaton), and we observe that file descriptors accumulate, especially with pypy, where their number can reach several thousand.
Scapy version
scapy-2.4.5rc1.dev317
Python version
python 3.10 and pypy 3.9
Operating system
Debian 5.10.127-1 (2022-06-30)
Additional environment information
We are using scapy to infer TLS state machines with the L* algorithm (see our tool if you are interested).
For our experiments, we tried switching from python to pypy, but we ran into file descriptors that are never explicitly closed: since pypy does not use reference-counting garbage collection, fds waiting to be collected can linger for a long time, which is problematic for us. This is a known behavior of pypy (see the first link under Related resources), but we do not know how to fix it properly in scapy's TLS code. For your information, it seems related to a recent issue where test sockets were not properly closed (#3677), and an old bug (#1831) also seems related.
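As a stopgap, one can force a collection between connections (gc.collect() is the workaround generally suggested for pypy); this is only a sketch on the caller's side, not a fix for the missing explicit close() calls:

import gc

from scapy.layers.tls.automaton_cli import TLSClientAutomaton

def run_with_collect(host, port, tls_version, n):
    for _ in range(n):
        tls = TLSClientAutomaton(
            server=host,
            dport=port,
            version=tls_version,
            data="quit",
        )
        tls.run()
        # Drop the last reference and force a GC pass so that pypy
        # finalizes the automaton and closes its fds right away.
        del tls
        gc.collect()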
We are at your disposal for more tests, and thank you in advance for your help!
How to reproduce
Below is a simple reproducer using TLS automata. The idea is to run n TLS connections in a row while monitoring the number of open file descriptors in /proc/PID/fd. We observe a large number of open fds (mostly pipes, but also sockets) that take a long time to be closed. We believe this comes from the implementation of automata in scapy, but it might also be TLS-specific.
from scapy.layers.tls.automaton_cli import TLSClientAutomaton

def run(host, port, tls_version, n):
    # Run n TLS connections in a row, one fresh automaton per connection.
    for _ in range(n):
        tls = TLSClientAutomaton(
            server=host,
            dport=port,
            version=tls_version,
            data="quit",
        )
        tls.run()

run("SERVERNAME", 443, "tls12", 1000)
For n=1000, the maximum reached with python is 185 open fds. With pypy, several fds seem to never get closed, leading to a growing pool of stale fds. In a first experiment, we peaked at 955 open fds (and ended at 765); in a second one, we reached 1033, which triggered an error in select (select only supports fds below 1024).
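To illustrate the select failure mode independently of scapy, here is a standalone sketch; it assumes the hard fd limit allows raising the soft limit, and it only demonstrates that select() rejects descriptor numbers at or above FD_SETSIZE (1024):

import os
import resource
import select

# Raise the soft fd limit so descriptor numbers can actually reach 1024
# (assumes the hard limit allows it).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if soft < 2048:
    resource.setrlimit(resource.RLIMIT_NOFILE, (2048, hard))

fds = []
try:
    # Open pipes until one descriptor number reaches FD_SETSIZE (1024).
    while True:
        r, w = os.pipe()
        fds += [r, w]
        if r >= 1024:
            break
    select.select([r], [], [], 0)
except ValueError as exc:
    print(exc)  # "filedescriptor out of range in select()"
finally:
    for fd in fds:
        os.close(fd)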
Actual result
The number of open file descriptors grows steadily with pypy. This can lead to resource exhaustion or errors with select.
Expected result
We expect the number of open file descriptors to remain low throughout the experiment, which is what we observe with python.
Related resources
- https://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies
- https://github.com/secdev/scapy/issues/3677
- https://github.com/secdev/scapy/issues/1831