[Bug]: memory leak due to exception
What happened?
When the following exception is thrown, the memory allocated for the query is not released after the query ends. After many such queries, memory is exhausted.
if (result.size() != k) throw std::runtime_error( "Cannot return the results in a contigious 2D array. Probably ef or M is too small");
Versions
Chroma v0.5.13
Relevant log output
No response
@SarielMa, thanks for reporting this. We need to do a bit more digging to determine whether the leak (if there is one) happens in pybind11 or in the Python code.
Hey @SarielMa! Can you share more information on your setup? Are you running this on a Windows machine by chance?
hey @SarielMa, do you have an observation as to how long it takes for the memory to be consumed? e.g. time or number of queries?
I can demonstrate that hnswlib does leak memory when an exception is thrown. Here's the call sequence:
- allocate result - https://github.com/chroma-core/hnswlib/blob/1aaa5e12320d0bf39844d4fb4fd9504c272c44e2/python_bindings/bindings.cpp#L689-L690
- start parallel execution for each item in the query (most of the time this is just 1)
- wait for threads to complete (https://github.com/chroma-core/hnswlib/blob/1aaa5e12320d0bf39844d4fb4fd9504c272c44e2/python_bindings/bindings.cpp#L76)
- check for exceptions - if found rethrow it (https://github.com/chroma-core/hnswlib/blob/1aaa5e12320d0bf39844d4fb4fd9504c272c44e2/python_bindings/bindings.cpp#L80)
- function exits without ever registering the py::capsule free callbacks - https://github.com/chroma-core/hnswlib/blob/1aaa5e12320d0bf39844d4fb4fd9504c272c44e2/python_bindings/bindings.cpp#L735-L738
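The sequence above can be modeled in plain Python to make the control flow easier to follow. This is only an illustrative sketch, not the binding's code: `FakeHeap`, `knn_query_leaky`, `knn_query_guarded`, and `run_failing` are hypothetical names, and the 8-byte label and 4-byte distance sizes are assumed for a 64-bit build.

```python
# Illustrative model only: plain Python standing in for the C++ binding.
# All names here are hypothetical and do not exist in hnswlib or Chroma.

LABEL_SIZE = 8  # assumed sizeof(size_t) on a 64-bit build
DIST_SIZE = 4   # assumed sizeof(float)
ERROR = ("Cannot return the results in a contigious 2D array. "
         "Probably ef or M is too small")


class FakeHeap:
    """Tracks bytes that were allocated but never freed."""

    def __init__(self):
        self.outstanding = 0

    def alloc(self, nbytes):
        self.outstanding += nbytes
        return nbytes  # the "pointer" is just the block size in this model

    def free(self, nbytes):
        self.outstanding -= nbytes


def knn_query_leaky(heap, rows, k, fail):
    """Allocate the result buffers, then raise before cleanup is registered."""
    labels = heap.alloc(rows * k * LABEL_SIZE)
    dists = heap.alloc(rows * k * DIST_SIZE)
    if fail:
        # Models the rethrow after the worker threads finish.
        raise RuntimeError(ERROR)
    # Only reached on success: the caller now owns the buffers.
    return labels, dists


def knn_query_guarded(heap, rows, k, fail):
    """Same flow, but the buffers are released on the error path."""
    labels = heap.alloc(rows * k * LABEL_SIZE)
    dists = heap.alloc(rows * k * DIST_SIZE)
    try:
        if fail:
            raise RuntimeError(ERROR)
    except Exception:
        heap.free(labels)
        heap.free(dists)
        raise
    return labels, dists


def run_failing(query_fn, calls, rows=1, k=10):
    """Run `calls` failing queries and report the bytes left allocated."""
    heap = FakeHeap()
    for _ in range(calls):
        try:
            query_fn(heap, rows, k, fail=True)
        except RuntimeError:
            pass
    return heap.outstanding


print(run_failing(knn_query_leaky, 1000))
print(run_failing(knn_query_guarded, 1000))
```

In the real binding the equivalent guard would live in C++ (for example, by holding the buffers in an owning object until the capsules are created); the try/except above only stands in for that idea.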
Here's some Python code that reproduces the exact error you encountered. Once it finds a batch of embeddings that triggers the error, it repeats the failing query in a loop and prints memory stats. Note that this runs for a very long time (adjust the x loop):
import gc
import uuid
import tracemalloc

import chromadb
import numpy as np
import psutil

tracemalloc.start()
np.random.seed(42)
process = psutil.Process()
data = np.random.uniform(-1, 1, (1000, 500, 384))

client = chromadb.PersistentClient("contiguous2d")
# client = chromadb.HttpClient()
collection = client.get_or_create_collection("test_collection")

for i in range(data.shape[0]):
    try:
        print("Iteration: ", str(i))
        gc.collect()
        ids = [f"{uuid.uuid4()}" for _ in range(data[i].shape[0])]
        collection.add(ids=ids, embeddings=data[i])
        random_embeddings = [data[i][np.random.choice(data[i].shape[0])].tolist()]
        collection.query(query_embeddings=data[i], n_results=10)
        collection.delete(ids=ids)
        gc.collect()
    except Exception as e:
        # This batch triggered the error: record a baseline before spamming queries
        print(e)
        snapshot1 = tracemalloc.take_snapshot()
        memory_info = process.memory_info()
        # Print the memory usage
        print(f"RSS: {memory_info.rss / (1024 ** 2):.2f} MB")
        print(f"VMS: {memory_info.vms / (1024 ** 2):.2f} MB")
        for x in range(100000):
            try:
                # print("leak check ",x)
                collection.query(query_embeddings=data[i], n_results=500)
            except Exception as e1:
                # Expected failure: keep repeating the same query
                continue
        memory_info = process.memory_info()
        # Print the memory usage
        print(f"RSS: {memory_info.rss / (1024 ** 2):.2f} MB")
        print(f"VMS: {memory_info.vms / (1024 ** 2):.2f} MB")
        snapshot2 = tracemalloc.take_snapshot()
        stats = snapshot2.compare_to(snapshot1, 'lineno')
        for stat in stats[:20]:  # Top 20 changes
            print(stat)
        raise e
RSS grows over time.
I've also created sample C++ code to reproduce the conditions - https://gist.github.com/tazarov/71fe6f2e8d5947dde998e83ee9a57d0a
Overall, yes, there does seem to be a leak when the exception is raised, but it is quite small (120 bytes with defaults) in 99.99% of cases. Two factors affect the size of the leak:
- very large n_results, which is k in the cpp binding
- lots of queries in query_embeddings (or query_texts), which is rows in the cpp binding
The most common case is a single query embedding with the default n_results=10. That leaks 10 * 8 bytes (on 64-bit) for labeltype, which is size_t, plus 10 * 4 bytes for dist_t, which is float, for a total of 120 bytes per exception per query.
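The arithmetic above generalizes to other values of rows and k. The helper below is a rough estimate only: `leak_bytes_per_failed_call` is a hypothetical name, and it assumes both result buffers hold rows * k elements, with 8-byte labels and 4-byte distances as on a 64-bit build.

```python
def leak_bytes_per_failed_call(rows: int, k: int,
                               label_size: int = 8,
                               dist_size: int = 4) -> int:
    """Estimate bytes left allocated by one failing query call.

    Assumes one label buffer and one distance buffer, each holding
    rows * k elements (an assumption based on the description above).
    """
    return rows * k * (label_size + dist_size)


# Default case from the thread: one query embedding, n_results=10.
print(leak_bytes_per_failed_call(rows=1, k=10))

# Shape used by the repro script: 500 query embeddings, n_results=500.
print(leak_bytes_per_failed_call(rows=500, k=500))
```

Under these assumptions the repro's query shape would leave roughly 3 MB allocated per failing call, which would explain why RSS growth is easy to observe there while the default case is hard to notice.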