psycopg2 icon indicating copy to clipboard operation
psycopg2 copied to clipboard

Potential memory leak with long-running client-side cursors

Open charles-cooper opened this issue 3 years ago • 5 comments

I noticed when using a long-running client-side cursor which executes many different templates that it can lead to a memory leak. I can try to provide a repro + tracemalloc but from memory the code is something like

def construct_values(n):
  return ",".join("%s" for _ in range(n))

conn = psycopg2.connect()
c = conn.cursor()
q = queue.Queue()  # could receive values from another coroutine/thread/process
while True:
  item = q.get()
  template = f"INSERT INTO ... VALUES ({construct_values(len(item))})"
  c.execute("template", item)

This isn't necessarily a bug since cursors are kind of expected to be short-lived. But if this leak is reproducible then I think at least a note should be added to the docs saying that leaving a cursor open can lead to memory leaks.

charles-cooper avatar Jul 04 '21 15:07 charles-cooper

I suspect that the issue is here, depending on the branching Py_INCREF(query) may be called twice. https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L408-L444

EDIT: I think (but am not positive that) the first Py_INCREF(query) can be safely removed.

charles-cooper avatar Jul 04 '21 15:07 charles-cooper

Hi, thank you for looking into this. The refcount seems right for me: after this if:

https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L402-L411

if we take the if side, query and fquery will be two separate python objects each with 1 ref. If we take the else side they will refer to the same object, which will have 2 refs. In both cases, at the end of the function

https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L453-L456

these two refs are decreased and the object could legit die. However, in case before exit we take this branch:

https://github.com/psycopg/psycopg2/blob/39f12bbfc529f16e28fc8b03dd4162bdfd85b10f/psycopg/cursor_type.c#L440-L444

the object acquires an extra ref and is associated to self. This transfers the ownership of the object to self: the two decref won't cause the object's deallocation.

If any there could be a leak if self->query is not decref'd correctly. However it seems that Py_CLEAR(self->query) is always called before setting it a value, so at a first glance I don't know where the leak could be.

dvarrazzo avatar Jul 04 '21 21:07 dvarrazzo

@dvarrazzo thank you for your clear explanation; I agree now the existing refcount looks right. I missed the Py_XDECREF in the exit block. I can do a bit more work on tracing the leak but might be a bit slow as I am busy with other projects.

charles-cooper avatar Jul 05 '21 21:07 charles-cooper

@charles-cooper have you got any update on this?

dvarrazzo avatar Sep 07 '21 11:09 dvarrazzo

No, sorry, haven't had a chance to look into this

charles-cooper avatar Sep 07 '21 15:09 charles-cooper

@charles-cooper Is there any fix of the memory leak of this?

JerryKwan avatar Feb 03 '23 05:02 JerryKwan

No memory leak has been shown in 18 months. This can be closed.

dvarrazzo avatar Feb 03 '23 06:02 dvarrazzo