hnswlib
hnswlib copied to clipboard
Empty Index occupies large memory space
Hi! Thanks for this wonderful library!
Every empty Index with max_elements of 100 is taking 2.5 MB of RAM, which is quite expensive overhead for index object without data. Initially I measured this using psutil
, and later using resource
module (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
).
Later I did another measure (just to be sure) by putting script into container and using docker stats command to see RAM usage. It turns out like this: For 1000 empty 512d indexes:
max_elements = 100; MEM = 2.487 GB max_elements = 1000; MEM = 2.530 GB max_elements = 10000; MEM = 2.905 GB max_elements = 100000; MEM = 6.603 GB
For reference my parameters are: M = 64, ef_construction = 400, dim = 512, space = "ip".
Am I missing something? Appreciate your replies!
Hi @levchik. Hm. Not sure what is happening. Maybe something python specific.Will try to look
I've looked into the code that happens because link_list_update_locks_
are initialized to 65536 elements regardless of the data size (cc @apoorv-sharma - we can probably fix that it would be a fraction of the dataset).
I wonder, does this pose a real problem in applications?
Hi, thanks for taking a look at this!
It is actually a problem for at least my use case, where I have multiple users of my app each having their data separated from each other, therefore each user has it's own Index instance and many users might potentially have very small amount of vectors stored like 10 or something.
Got it. If we merge #332 soon, will it work as a mitigation (brute force might be even faster for small indices and is more memory efficient)?
That's interesting idea. So in this case I'll have to use both brute-force and regular index - first, user starts with brute-force and the tricky part is to find out when to switch index to the regular one (assuming there will be not much data by that time, so that I can fit that in memory, and fill new index and remove old one).
I guess I'll need to do some benchmarking after that MR to see where it fits.
One thing that could be done is to reduce the number of locks for update. I think if the number of locks are reduced to half or even 1/4th, it should not impact performance by a lot. wdyt?