
TLS/SSL asyncio leaks memory

Open rojamit opened this issue 2 years ago • 83 comments

Bug report

Bug description:

Python 3.9 without uvloop doesn't leak memory (or leaks noticeably less). Python 3.11+ (and possibly others) leaks a LOT of memory under load, with or without uvloop - up to +2 GB per test!

Test command: ab -n50000 -c15000 -r https://127.0.0.1/ (apt install apache2-utils)

import asyncio, ssl, uvloop

class HTTP(asyncio.Protocol):
    
    def __init__(self):
        self.transport = None
        
    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(
            b'HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n'
        )
        self.transport.close()
        
def make_tls_context():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain('cert.crt', 'cert.key')
    return ctx

tls_context = make_tls_context()
loop = uvloop.new_event_loop()

async def start_server(loop):
    return await loop.create_server(
        HTTP, '127.0.0.1', 443, backlog=65535,
        ssl=tls_context)

loop.run_until_complete(start_server(loop))
loop.run_forever()
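The script expects cert.crt and cert.key in the working directory; one way to generate a throwaway self-signed pair, assuming the openssl CLI is available (a sketch, not part of the original repro):

```shell
# Create a self-signed certificate and key for 127.0.0.1, valid for one day.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout cert.key -out cert.crt \
  -days 1 -subj "/CN=127.0.0.1"
```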

CPython versions tested on:

3.9, 3.11, 3.12

Operating systems tested on:

Debian Linux

Linked PRs

  • gh-113338
  • gh-114074
  • gh-114311
  • gh-114367
  • gh-114829
  • gh-114830
  • gh-115643

rojamit avatar Sep 18 '23 11:09 rojamit

Can confirm that the issue exists - Python 3.11 and uvloop 0.17.0, when setting up many client SSL connections that sometimes reconnect - even directly with loop.create_connection - and something is not cleaned up properly somewhere.

Sometimes it leaked several megabytes per second for me.

I was able to track this down by attaching to the running Python process with gdb: when it was leaking, I set a catchpoint on brk syscalls and examined 40 sequential catchpoint events; all of them were related to /usr/lib/python3.10/ssl.py.

(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3.10/ssl.py", line 917, in read
    v = self._sslobj.read(len)
  <built-in method buffer_updated of uvloop.loop.SSLProtocol object at remote 0xffff758f90c0>
  <built-in method run of _contextvars.Context object at remote 0xffff758c7340>
  File "/home/ubuntu/py-core/env/lib/python3.10/site-packages/aiohttp/web.py", line 544, in run_app
    loop.run_until_complete(main_task)
  File "/home/ubuntu/py-core/server_app.py", line 39, in <module>
    web.run_app(app_factory(), loop=LOOP, host=args.host, port=args.port)
(gdb)

Also, the gc.get_objects() count was not actually growing, which is why I suspect this is related to a lower-level native implementation.

selevit avatar Nov 17 '23 05:11 selevit

@1st1, @asvetlov, @graingert, @gvanrossum, @kumaraditya303 (as asyncio module experts)

arhadthedev avatar Nov 18 '23 19:11 arhadthedev

Unfortunately I am really bad at this type of low level stuff. :-( If you suspect a leak in the C accelerator, you can disable it and see if the leak goes away. (Or at least becomes less severe.)

gvanrossum avatar Nov 18 '23 20:11 gvanrossum

OTOH if the problem is in the SSL code, we need a different kind of expert.

gvanrossum avatar Nov 18 '23 20:11 gvanrossum

Finally please figure out if this is in uvloop or not. If it’s tied to uvloop this is the wrong tracker.

gvanrossum avatar Nov 18 '23 20:11 gvanrossum

Unfortunately, this bug was reproduced only in our production environment, and we weren't able to trigger it anywhere else.

It's already mitigated, as the root cause was in poor stability of connections.

I don't think I would be able to reproduce it again in production without uvloop, because it caused a lot of harm to our system.

But ready to assist and provide more details, if anyone wants to ask me some clarifying questions about it.

Also, there are several related tickets, and I'm not sure if they are all about uvloop:

  1. https://github.com/python/cpython/issues/106684 - very similar cpython issue, already closed though.
  2. https://github.com/aio-libs/aiohttp/issues/7252 - a recent, already-closed aiohttp issue; they were able to mitigate it somehow, but I'm not sure it was the same bug.

selevit avatar Nov 19 '23 07:11 selevit

I actually was able to reproduce this leak in isolation several days ago, but it leaked only ~5 megabytes over several hours. I adapted a repro script from this comment on the aiohttp issue mentioned above so it was ready to run, and ran it for about 4 hours.

The memory consumption (RSS) initially stabilised at 258120 bytes, but after several hours increased to ~264000. Hope it's the same bug.

#!/usr/bin/env python3
import aiohttp
import tracemalloc
import ssl
import asyncio

def init():
    timeout = aiohttp.ClientTimeout(connect=5, sock_read=5)
    ssl_ctx = ssl._create_unverified_context()
    conn = aiohttp.TCPConnector(ssl=ssl_ctx, enable_cleanup_closed=True)
    session = aiohttp.ClientSession(connector=conn, timeout=timeout, cookie_jar=aiohttp.CookieJar(unsafe=True))
    return session

async def fetch(client):
    try:
        async with client.request('GET', url='https://api.pro.coinbase.com/products/BTC-USD/ticker') as r:
            msg = await r.text()
            print(msg)
    except asyncio.CancelledError:
        raise
    except Exception as err:
        print("error", err)
        pass

async def main(): 
    requests = 600
    clients = [init() for _ in range(requests)]
    tracemalloc.start()
    try:
        while True:
            await asyncio.gather(*[fetch(client) for client in clients])
            await asyncio.sleep(5)
    except asyncio.CancelledError:
        pass  # end and clean things up
    finally:
        memory_used = tracemalloc.get_traced_memory()
        print("traced memory (current, peak):", memory_used)
        snapshot = tracemalloc.take_snapshot()
        stats = snapshot.statistics('lineno')
        for stat in stats[:10]:
            print(stat)
        try:
            await asyncio.gather(*[client.close() for client in clients])
        except Exception:
            pass

asyncio.run(main())

selevit avatar Nov 19 '23 07:11 selevit

Could you find a reproducer without aiohttp?

gvanrossum avatar Nov 19 '23 15:11 gvanrossum

Unsure if this is helpful or related, but I came across https://github.com/encode/uvicorn/discussions/2078. The thread concludes that the memory not being released is not isolated to uvicorn but is also seen in granian/gunicorn/hypercorn, and could therefore be at the interpreter level (apologies for butchering the summary). The thread has some great charts/analysis; although it is at the server level, the implementation differences between granian and uvicorn may help narrow down where the issue surfaces, if it's related. Example repo: https://github.com/Besedo/memory_issue/tree/main by @EBazarov. Apologies in advance if this is unrelated; I will remove the comment / create one in the right place if so.

gi0baro:

Given everything we stated in the replies above, I suspect there's not much to do about this. Given the number of frameworks and servers tested, it just seems the Python interpreter is the culprit here, since even when the heap memory gets released, the RSS remains high.
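For anyone who wants to watch RSS from inside the process itself (a sketch; this reads /proc/self/status, so it is Linux-only and returns None elsewhere):

```python
def rss_kib():
    """Return this process's resident set size in KiB, or None off-Linux."""
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    # The line looks like: "VmRSS:     123456 kB"
                    return int(line.split()[1])
    except OSError:
        return None
    return None

print(rss_kib())
```

Printing this once a second alongside the repro gives the same signal as the external charts, without graphing tools.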

Faolain avatar Nov 27 '23 19:11 Faolain

Upstream is pretty hard pressed to debug this (I'm no web developer or admin). Can I ask one of the framework owners or users who are struggling with this to help us reduce their example to the point where we have a small self-contained program that demonstrates the issue?

gvanrossum avatar Nov 27 '23 22:11 gvanrossum

Upstream is pretty hard pressed to debug this (I'm no web developer or admin). Can I ask one of the framework owners or users who are struggling with this to help us reduce their example to the point where we have a small self-contained program that demonstrates the issue?

I provided a working example of the leak - testing with ab gives a 1+ GB/min leak (OpenSSL version 1.1). I can provide access to a dedicated server where this bug happens (more than one server is affected).

IMPORTANT: there is no problem when using Python 3.9 WITHOUT uvloop - with uvloop it leaks, without it does not. BUT Python 3.12 leaks even more, with or without uvloop...

rojamit avatar Dec 10 '23 23:12 rojamit

Shortened the MVP even more:

import asyncio, ssl, uvloop

class HTTP(asyncio.Protocol):
    
    def __init__(self):
        self.transport = None
        
    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(
            b'HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n'
        )
        self.transport.close()
        
def make_tls_context():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain('cert.crt', 'cert.key')
    return ctx

tls_context = make_tls_context()
loop = uvloop.new_event_loop()
#loop = asyncio.new_event_loop()

async def start_server(loop):
    return await loop.create_server(
        HTTP, '127.0.0.1', 443, backlog=65535,
        ssl=tls_context)

loop.run_until_complete(start_server(loop))
loop.run_forever()

rojamit avatar Dec 10 '23 23:12 rojamit

Updated the first post for clarity. The problem is not with OpenSSL itself; it leaks without uvloop even on Python 3.12.

rojamit avatar Dec 10 '23 23:12 rojamit

Example certificates: certs.zip

rojamit avatar Dec 10 '23 23:12 rojamit

I only have a Mac, and I got the example to work and even managed to install the apache2 utils. I modified the example to print the process size once a second (rather than bothering with graphs) using psutil. Then it got messy: the ab tool often ends with an error. It does show a huge process size increase when ab sort-of works.

Now I've got an idea. Given the complexity of the ssl module (a C extension that wraps OpenSSL) it is much more likely that the problem (if there is really a problem) is due to the code in that extension module than in asyncio, whose TLS support is just a small amount of Python code.

So maybe someone who is interested in getting to the bottom of this issue could rewrite the example without the use of asyncio. Instead you could use the socketserver module. This is ancient tech, but showing the leak when just using this will prove that the problem is in the ssl extension. We can then try to narrow down what happens in the ssl module to find the leak (I'm suspecting there's some path through the code that doesn't release a buffer it owns, but that covers a lot of cases).

(If you can't figure out how to use socketserver with TLS, maybe just a basic multi-threaded hand-written serving loop could work. In either case you have to make sure to follow all the rules for serving TLS, of course.)
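To sketch that idea (hedged: Handler, TLSThreadingServer, and the serve() helper are illustrative names, and the cert paths are assumed from the earlier repro), a threaded socketserver that wraps each accepted connection in TLS could look like:

```python
import socketserver
import ssl

class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        self.rfile.readline()  # consume the request line
        self.wfile.write(
            b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
        )

class TLSThreadingServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True

    def __init__(self, addr, handler, tls_context):
        self.tls_context = tls_context
        super().__init__(addr, handler)

    def get_request(self):
        # Wrap each accepted socket in TLS before the handler sees it.
        sock, addr = super().get_request()
        return self.tls_context.wrap_socket(sock, server_side=True), addr

def serve(certfile, keyfile):
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile, keyfile)
    with TLSThreadingServer(("127.0.0.1", 4443), Handler, ctx) as srv:
        srv.serve_forever()

# serve("cert.crt", "cert.key")  # run with the repro's certificate files
```

If hammering this server with ab shows the same growth, the leak is below asyncio; if not, it points back at sslproto.py.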

gvanrossum avatar Dec 11 '23 21:12 gvanrossum

@gvanrossum Important note: Python 3.9.2 without uvloop (pure asyncio) doesn't leak with the same code, but it leaks when using uvloop. As far as I know, after Python 3.9 the SSL code was ported from uvloop to CPython?

The biggest leak I saw was on Python 3.12 (without uvloop).

rojamit avatar Dec 14 '23 20:12 rojamit

This doesn’t really help. We need someone to try and find which objects are being leaked.

gvanrossum avatar Dec 14 '23 20:12 gvanrossum

This doesn’t really help. We need someone to try and find which objects are being leaked.

I tried different Python memory profilers with no result at all; I don't know how to create a useful dump, but the core dump contains tons of server certificate info.

rojamit avatar Dec 15 '23 18:12 rojamit

Might one of the magical functions in the gc module that give you all the objects be helpful?
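For example (a sketch using only the stdlib): snapshot type counts from gc.get_objects() before and after the suspect workload and diff them - whichever types keep growing are the leak candidates. Note this only sees GC-tracked objects, so raw buffers won't show up here.

```python
import gc
from collections import Counter

def type_counts():
    # Snapshot live, GC-tracked objects grouped by type name.
    return Counter(type(o).__name__ for o in gc.get_objects())

before = type_counts()
suspect = [[i] for i in range(1000)]  # stand-in for the leaking workload
after = type_counts()

# Counter subtraction keeps only the types whose counts grew.
growth = (after - before).most_common(5)
print(growth)
```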

gvanrossum avatar Dec 15 '23 19:12 gvanrossum

objgraph is great for this https://objgraph.readthedocs.io/en/stable/#memory-leak-example

graingert avatar Dec 15 '23 19:12 graingert

Sorry that this is without a repro (I will try to create one anyway), but I also got the same issue with Tornado + SSL. And I can confirm such a leak exists even without uvloop; sometimes (if I respond 404 to a Cloudflare proxy a lot) it becomes significant. @rojamit sorry, but in that case you should try valgrind; I will also try on the weekend.

PS @rojamit does your project use Cloudflare? I found a very annoying behavior related to keep-alive which can also produce the illusion of a memory leak.

stalkerg avatar Dec 20 '23 01:12 stalkerg

Summary

Findings:

  • This appears to be an issue when the SSL handshake fails here
  • When you turn off SSL verification, the memory leak is not detected.
  • The key memory allocation that is not being cleaned is at sslproto.py:275

Minimum replication

This snippet runs the server and runs two separate pings (one insecure) in a subprocess, capturing the tracemalloc snapshots before and after each ping.

The output is as follows

Δ Memory Allocations = 2727.56kb
Δ Memory Allocations = 6.66kb # without SSL

Notice the n_iter=10

import asyncio
import asyncio.sslproto
import ssl
import tracemalloc

class HTTP(asyncio.Protocol):
    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(
            b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"
        )
        self.transport.close()


def make_tls_context():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(".jamie/iss109534/server.crt", ".jamie/iss109534/server.key")
    return ctx


async def start_server(loop):
    tls_context = make_tls_context()

    return await loop.create_server(
        HTTP, "127.0.0.1", 4443, backlog=65535, ssl=tls_context, start_serving=True
    )


async def ping(delay: float = 1.0, n_iter: int = 1, insecure: bool = False):

    await asyncio.sleep(delay)

    # ----------------------------------------------------------------------------------
    # Before
    current_1, _ = tracemalloc.get_traced_memory()

    # ----------------------------------------------------------------------------------
    # Run a single request
    if insecure:
        cmd = "curl --insecure"
    else:
        cmd = "curl"

    for _ in range(n_iter):
        proc = await asyncio.create_subprocess_shell(
            f"{cmd} https://127.0.0.1:4443",
            stderr=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE
        )
        await proc.communicate()

    # ----------------------------------------------------------------------------------
    # After
    current_2, _ = tracemalloc.get_traced_memory()

    print(f"Δ Memory Allocations = {(current_2 - current_1)/1000:.2f}kb")


if __name__ == "__main__":
    tracemalloc.start()

    loop = asyncio.new_event_loop()
    loop.run_until_complete(start_server(loop))
    # Run with SSL verification
    loop.run_until_complete(ping(delay=0.5, n_iter=10))
    # Run without SSL verification
    loop.run_until_complete(ping(delay=0.5, insecure=True, n_iter=10))

    loop.close()

Trace malloc snapshot

I updated ping to also take a tracemalloc.take_snapshot before and after and we see that the key allocation is here: sslproto.py:275

async def ping(delay: float = 1.0, n_iter: int = 1, insecure: bool = False):

    # ...

    snapshot_1 = tracemalloc.take_snapshot()

    # ...
    # Same as before
    # ....

    snapshot_2 = tracemalloc.take_snapshot()

    print('-'*40)
    if insecure:
        print("Insecure")

    print(f"Δ Inner Memory Allocations = {(current_2 - current_1)/1000:.2f}kb")

    top_stats = snapshot_2.compare_to(snapshot_1, 'traceback')
    print("\n[ Top stat ]")
    for stat in top_stats[:1]:
        print(stat)
        for line in stat.traceback.format(limit=25):
            print("\t", line)

    print('-'*40)

And, now we can see where the allocations are happening (with n_iter=1)

Δ Inner Memory Allocations = 283.75kb

[ Top stat ]
/Users/jamphan/git/github.com/ordinary-jamie/cpython/.jamie/iss109534/tmp02.py:91: size=256 KiB (+256 KiB), count=2 (+2), average=128 KiB
           File "/Users/jamphan/git/github.com/ordinary-jamie/cpython/.jamie/iss109534/tmp02.py", line 91
             loop.run_until_complete(ping(delay=0.5))
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/base_events.py", line 671
             self.run_forever()
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/base_events.py", line 638
             self._run_once()
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/base_events.py", line 1971
             handle._run()
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/events.py", line 84
             self._context.run(self._callback, *self._args)
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/selector_events.py", line 224
             transport = self._make_ssl_transport(
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/selector_events.py", line 83
             ssl_protocol = sslproto.SSLProtocol(
           File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 275
             self._ssl_buffer = bytearray(self.max_size)

Dev Mode

We can further confirm this when we run python -X dev we see the following warnings

Error on transport creation for incoming connection
handle_traceback: Handle created at (most recent call last):
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/base_events.py", line 1963, in _run_once
    handle._run()
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/selector_events.py", line 966, in _read_ready
    self._read_ready_cb()
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/selector_events.py", line 998, in _read_ready__get_buffer
    self._protocol.buffer_updated(nbytes)
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 436, in buffer_updated
    self._do_handshake()
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 561, in _do_handshake
    self._on_handshake_complete(exc)
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 586, in _on_handshake_complete
    self._wakeup_waiter(exc)
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 366, in _wakeup_waiter
    self._waiter.set_exception(exc)
protocol: <__main__.HTTP object at 0x1039294b0>
transport: <asyncio.sslproto._SSLProtocolTransport object at 0x103d6a210>
Traceback (most recent call last):
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/selector_events.py", line 235, in _accept_connection2
    await waiter
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 576, in _on_handshake_complete
    raise handshake_exc
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/asyncio/sslproto.py", line 557, in _do_handshake
    self._sslobj.do_handshake()
  File "/Users/jamphan/.pyenv/versions/3.12.1/lib/python3.12/ssl.py", line 917, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: TLSV1_ALERT_UNKNOWN_CA] tlsv1 alert unknown ca (_ssl.c:1000)

Edit

Using curl --cacert /path/to/cert.crt as opposed to curl -k will get the same results; just to confirm the insecure flag isn't an invalid test.

Edit 2, updated minimal replication

A simpler code example for replicating the issue...

import asyncio
import ssl
import tracemalloc


async def main(certfile, keyfile):
    tracemalloc.start()

    # Start server with SSL
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain(certfile, keyfile)
    await asyncio.start_server(
        lambda _, w: w.write(b"\0"), "127.0.0.1", "4443", ssl=ssl_context
    )

    current_1, _ = tracemalloc.get_traced_memory()

    # Ping the server with cURL
    proc = await asyncio.create_subprocess_shell(
        "curl https://127.0.0.1:4443 2> /dev/null"
    )
    await proc.communicate()

    current_2, _ = tracemalloc.get_traced_memory()
    print(f"{(current_2-current_1)/1000:.2f}KB")


if __name__ == "__main__":

    asyncio.run(
        main(certfile="server.crt", keyfile="server.key")
    )

ordinary-jamie avatar Dec 20 '23 09:12 ordinary-jamie

@mjpieters Just curious -- have you ever experienced anything similar?

gvanrossum avatar Dec 20 '23 18:12 gvanrossum

@ordinary-jamie

The key memory allocation that is not being cleaned is at sslproto.py:275

Excellent find!! Can you think of a suitable place to free this memory? I think it must be passed into OpenSSL via a C wrapper and that C wrapper must somehow forget to free it? So it would have to be in the C code for _ssl where it receives that buffer.

I don't think the leak is in the Python code, even though that's where it's being allocated (Python objects don't easily leak attribute values).

gvanrossum avatar Dec 20 '23 20:12 gvanrossum

@mjpieters Just curious -- have you ever experienced anything similar?

I haven't, the code using asyncio HTTPS is relatively short-running (a few minutes at a time at most).

mjpieters avatar Dec 20 '23 22:12 mjpieters

Hi @gvanrossum, I actually think our culprit is this memoryview reference to the bytearray.

As a quick spike (see this draft PR: https://github.com/python/cpython/pull/113338), I added a .release() call and cleared the references to the bytearray in what appears to be the final method before we wake up the future and set the exception (SSLProto._fatal_error), and that seems to have fixed the memory leak!

Re-running the example above

Δ Memory Allocations = 59.38kb # Was 2727.56kb before
Δ Memory Allocations = 6.20kb

This should be the sequence of method calls that lead to the memory leak before waking up the waiter Future:

SSLProto.connection_made
└── SSLProto._start_handshake
    └── SSLProto._do_handshake
        └── SSLProto._on_handshake_complete
            ├── SSLProto._fatal_error
            └── SSLProto._wakeup_waiter
                └── _waiter.set_exception

The set exception on the waiter Future object should boil up to the BaseSelectorEventLoop._accept_connection Task (which wraps _accept_connection2). This is where we create the SSLProtocol (and keeps a reference to the waiter future) via _make_ssl_transport.

SSL module doesn't have a reference to the bytearray

I'm not familiar with it, but from what I can tell the _ssl.c extension doesn't appear to hold a reference to the bytearray. The SSLProtocol does have a reference to an SSLContext here (asyncio.sslproto.SSLProtocol._sslcontext = asyncio.sslproto._create_transport_context(...)), but this structure doesn't appear to hold any references to our bytearray.

Socket weak reference is released

A quick side note on the socket not being closed: we can see that the sock reference is released in the do_handshake impl function. I'm not sure where we should be closing this:

$ lldb -f python.exe -- .jamie/iss109534/minimal.py

(lldb) breakpoint set --file Modules/_ssl.c --line 1033

(lldb) r
Process 83199 launched: '/Users/jamphan/git/github.com/ordinary-jamie/cpython/python.exe' (arm64)
1 location added to breakpoint 1
Process 83199 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001021ad98c _ssl.cpython-313d-darwin.so`_ssl__SSLSocket_do_handshake [inlined] _ssl__SSLSocket_do_handshake_impl(self=0x0000000102e13750) at _ssl.c:1033:9 [opt]
   1030	    } while (err.ssl == SSL_ERROR_WANT_READ ||
   1031	             err.ssl == SSL_ERROR_WANT_WRITE);
   1032	    Py_XDECREF(sock);
-> 1033	    if (ret < 1)
   1034	        return PySSL_SetError(self, ret, __FILE__, __LINE__);
   1035	    if (PySSL_ChainExceptions(self) < 0)
   1036	        return NULL;

(lldb) frame variable sock
(PySocketSockObject *) sock = NULL

ordinary-jamie avatar Dec 20 '23 23:12 ordinary-jamie

Good points. There are other places where that view goes out of scope, though. I don't recall - doesn't memoryview automatically release its reference when it is deleted? If not, that would be a common cause of leaks, so maybe it does, and the problem is that the view itself is being held onto by something else?

E.g. doing a simple search I found that get_buffer() in sslproto.py overwrites both buffer and view -- should this also release the view? And what about cases that don't end with _fatal_error()? Why don't those leak? (Or do they?)

gvanrossum avatar Dec 20 '23 23:12 gvanrossum

Sorry for the slow response on this, @gvanrossum. I will continue looking at this, but with my limited knowledge, I am not making much progress (if someone more experienced wants to take over)

This might be my misunderstanding of CPython, but when I tried tools such as macOS's leaks I was not able to find any memory leaks.

Would this suggest that the reference to the SSLProtocol is simply being held a little longer than needed, so there is a delay in dereferencing the SSLProtocol object (and its memoryview/bytearray buffers)?

To support this, here's a little test setup:

  1. Run a server as above in loop.run_forever()
  2. Monitor leaks output: pid=$(pgrep python.exe); watch -n 0.1 leaks $pid
  3. Load the server with k6: k6 run --vus 100000 --duration 1s load.js

As k6 progresses it will reach a high point in physical footprint, and eventually a chunk of the physical footprint will be deallocated (notice the delta between peak and current in the final output of leaks):

Physical footprint:         108.3M
Physical footprint (peak):  298.0M
Idle exit:                  untracked
----

leaks Report Version: 4.0
Process 75055: 54222 nodes malloced for 83935 KB
Process 75055: 0 leaks for 0 total leaked bytes.

ordinary-jamie avatar Jan 09 '24 21:01 ordinary-jamie

doing a simple search I found that get_buffer() in sslproto.py overwrites both buffer and view -- should this also release the view?

The buffer is created here in _SelectorSocketTransport, which in turn is passed back to SSLProtocol.buffer_updated. The ._incoming attribute is an ssl.MemoryBIO object, which releases the buffer after it is done writing here.
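For reference, that behavior can be seen in a small sketch: an ssl.MemoryBIO holds written bytes only until they are consumed, so draining it with read() releases the buffered data.

```python
import ssl

bio = ssl.MemoryBIO()
bio.write(b"client hello bytes")
print(bio.pending)   # 18 bytes buffered after the write

data = bio.read()    # drains (and frees) the BIO's internal buffer
print(data == b"client hello bytes", bio.pending)
```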

ordinary-jamie avatar Jan 09 '24 21:01 ordinary-jamie

I'm not surprised that the leaks tool doesn't find much; Python has several layers of its own memory allocation on top of the system allocator.

But you're on to something: maybe you could rerun your test after adding a gc.collect() call to the end of _run_once() in base_events.py.
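Rather than patching base_events.py, an equivalent experiment can be run from user code (a sketch; periodic_gc and the interval are illustrative): a background task that forces a full collection on every tick approximates a gc.collect() at the end of each _run_once().

```python
import asyncio
import gc

collections_run = 0

async def periodic_gc(interval: float) -> None:
    # Force a full collection periodically, approximating the suggested
    # gc.collect() call at the end of every _run_once() iteration.
    global collections_run
    while True:
        gc.collect()
        collections_run += 1
        await asyncio.sleep(interval)

async def main() -> None:
    collector = asyncio.create_task(periodic_gc(0.05))
    await asyncio.sleep(0.2)   # the real server workload would run here
    collector.cancel()

asyncio.run(main())
print(collections_run)
```

If RSS stabilizes with this running, the "leak" is really delayed collection of reference cycles rather than lost memory.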

gvanrossum avatar Jan 10 '24 02:01 gvanrossum