aerospike-client-python
Memory Leak while getting data from aerospike
Python's tracemalloc tool shows that calling get() on the aerospike client causes a memory leak.
How I use the client:
I connect to the Aerospike cluster by passing a list of hosts.
Then, in a loop (multiple iterations per second), multiple threads simultaneously read records from multiple sets.
Hi @smartist1401 I'll investigate this issue today and follow up with you soon.
Would you mind sharing the script so I can debug it?
Hi @juliannguyen4 Did you find anything? Please look into this as soon as you can. I am using your aerospike client in a serious project.
Hey @smartist1401, I have found a few leaks from get() caused by several different sources. I'll try to fix them and will give you a status update by tomorrow at 5 PM PST.
My notes about the get() leaks:
There are 2 sources of memory leaks in get(): raise_exception() and record_to_pyobject().
I investigated the key_to_pyobject() leaks coming from record_to_pyobject() and I haven’t found the cause yet. The leaks are coming from the namespace, set, and digest of the key tuple.
gdb -args python3 -m pytest new_tests/
# Using Python client 13.0.0
b src/main/conversions.c:1840
cond 2 py_namespace->ob_refcnt > 1 || py_set->ob_refcnt > 1 || py_digest->ob_refcnt > 1
# Test succeeds without breaking
I also added a breakpoint to check the reference counts of the record tuple’s objects as well as the reference count of the record tuple itself. None of them exceeded 1 when being returned by record_to_pyobject() and AerospikeClient_GetInvoke() respectively.
Using sys.getrefcount() doesn’t show any memory leaks either:
>>> import aerospike
>>> config = {"hosts": [("127.0.0.1", 3000)]}
>>> client = aerospike.client(config).connect()
>>> key = ("test", "demo", 1)
>>> client.put(key, {"a": 1})
0
>>> rec = client.get(key)
>>> import sys
>>> sys.getrefcount(rec)
2
>>> sys.getrefcount(rec[2])
2
>>> sys.getrefcount(rec[1])
2
>>> sys.getrefcount(rec[0])
2
@smartist1401 When you ran that script and saw the memory leaks reported by tracemalloc, did the records that were being queried exist on the server?
Hey @smartist1401, I worked on fixing the memory leak today but I wasn't able to fully solve it. I'll let you know once I have found a solution.
Hi @juliannguyen4 Thanks for the investigation. I get a record by key inside a try block and return the record if it exists; in the except block (record-not-found exception) I return an empty dict.
Some records exist and others do not.
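The read pattern described above might look roughly like this. It is a sketch under assumptions, not the reporter's code: `get_or_empty`, `FakeClient`, and `MissingRecord` are hypothetical names, and the fake client exists only so the example runs without a server. The exception class is passed in as a parameter for the same reason.

```python
def get_or_empty(client, key, not_found_error):
    """Return the record for key, or an empty dict if it does not exist."""
    try:
        return client.get(key)
    except not_found_error:
        # This is the path that makes the client raise on every missing record.
        return {}


class MissingRecord(Exception):
    """Hypothetical stand-in for the client's record-not-found exception."""


class FakeClient:
    """Hypothetical in-memory stand-in for a connected client."""

    def __init__(self, records):
        self._records = records

    def get(self, key):
        try:
            return self._records[key]
        except KeyError:
            raise MissingRecord(key) from None


existing_key = ("test", "demo", 1)
# Simplified (key, meta, bins) tuple, loosely modelled on what get() returns.
client = FakeClient({existing_key: (existing_key, {"gen": 1}, {"a": 1})})

found = get_or_empty(client, existing_key, MissingRecord)
missing = get_or_empty(client, ("test", "demo", 2), MissingRecord)
```

With the real client, the third argument would presumably be `aerospike.exception.RecordNotFound`. When some of the requested records do not exist, each such call goes through the exception path, which is why the existence of the records matters for the leak.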
Got it. As I mentioned in my comment above, there's a memory leak from raise_exception(), which should be called when calling get() on a record that does not exist. I'll try to fix this memory leak and provide you with a build with the fix.
Hi @juliannguyen4 I'm waiting ... :)
I'm still getting to the bottom of it.
Hi @juliannguyen4 Version 14.0.0 was released recently. Has the memory leak been fixed in this version?