psycopg2
psycopg2 copied to clipboard
Potential memory leak with long-running client-side cursors
I noticed when using a long-running client-side cursor which executes many different templates that it can lead to a memory leak. I can try to provide a repro + tracemalloc but from memory the code is something like
def construct_values(n):
return ",".join("%s" for _ in range(n))
conn = psycopg2.connect()
c = conn.cursor()
q = queue.Queue() # could receive values from another coroutine/thread/process
while True:
item = q.get()
template = f"INSERT INTO ... VALUES ({construct_values(len(item))})"
c.execute("template", item)
This isn't necessarily a bug since cursors are kind of expected to be short-lived. But if this leak is reproducible then I think at least a note should be added to the docs saying that leaving a cursor open can lead to memory leaks.
I suspect that the issue is here, depending on the branching Py_INCREF(query) may be called twice. https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L408-L444
EDIT: I think (but am not positive that) the first Py_INCREF(query) can be safely removed.
Hi, thank you for looking into this. The refcount seems right for me: after this if:
https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L402-L411
if we take the if side, query and fquery will be two separate python objects each with 1 ref. If we take the else side they will refer to the same object, which will have 2 refs. In both cases, at the end of the function
https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L453-L456
these two refs are decreased and the object could legit die. However, in case before exit we take this branch:
https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L440-L444
the object acquires an extra ref and is associated to self. This transfers the ownership of the object to self
: the two decref won't cause the object's deallocation.
If any there could be a leak if self->query
is not decref'd correctly. However it seems that Py_CLEAR(self->query)
is always called before setting it a value, so at a first glance I don't know where the leak could be.
@dvarrazzo thank you for your clear explanation; I agree now the existing refcount looks right. I missed the Py_XDECREF in the exit block. I can do a bit more work on tracing the leak but might be a bit slow as I am busy with other projects.
@charles-cooper have you got any update on this?
No, sorry, haven't had a chance to look into this
@charles-cooper Is there any fix of the memory leak of this?
No memory leak has been shown in 18 months. This can be closed.