hnswlib
feature: autoresize
There should be a feature that grows the index automatically when it runs out of capacity.
Example implementation:
# Double the capacity (not the current count) until the new batch fits.
while index.get_current_count() + len(to_add) > index.get_max_elements():
    index.resize_index(max(1, 2 * index.get_max_elements()))
On that note, what is the time complexity of a resize?
Agreed, that would be nice! The complexity of a resize is linear in the size of the dataset. Essentially it is an allocation, a copy, and a deallocation.
It may be worth writing a Python wrapper around the class to support this, although ideally it should be done in C++. The technical problem is that resize is not thread-safe with respect to insertion (other threads, including Python ones, need to finish inserting before the copy starts). Fixing this might be part of a larger overhaul of synchronization.
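A Python-side wrapper along these lines could look roughly like the sketch below. `AutoResizeIndex` and its `growth_factor` parameter are hypothetical names, not part of hnswlib. The sketch assumes the wrapped object exposes `add_items`, `resize_index`, `get_current_count` and `get_max_elements`, and that every insertion goes through the wrapper. Under that assumption a single lock keeps a resize from overlapping an insert, at the cost of serializing calls to `add_items`.

```python
import threading


class AutoResizeIndex:
    """Hypothetical wrapper (not part of hnswlib) that grows an index on demand."""

    def __init__(self, index, growth_factor=2):
        # Assumption: `index` exposes add_items, resize_index,
        # get_current_count and get_max_elements.
        if growth_factor < 2:
            raise ValueError("growth_factor must be an integer >= 2")
        self._index = index
        self._growth = growth_factor
        self._lock = threading.Lock()

    def add_items(self, data, ids=None):
        # One lock guards both the resize and the insert, so no insertion
        # made through this wrapper can run while a resize is copying.
        with self._lock:
            needed = self._index.get_current_count() + len(data)
            capacity = self._index.get_max_elements()
            if needed > capacity:
                new_capacity = max(capacity, 1)
                # Grow geometrically until the whole batch fits.
                while new_capacity < needed:
                    new_capacity *= self._growth
                self._index.resize_index(new_capacity)
            self._index.add_items(data, ids)
```

Threads that insert directly into the underlying index bypass the lock, so this does not solve the synchronization problem in general. It only avoids it for callers that use the wrapper.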
Good news on the complexity: autoresize would not change the asymptotic complexity and would only add a small constant factor (between 2 and 4).
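To make the amortized argument concrete, here is a small cost-model sketch. It is not a measurement of hnswlib: it assumes each resize costs one unit per stored element (consistent with a resize being linear in the dataset size) and that capacity doubles whenever the index is full. The function name and parameters are illustrative only.

```python
def copy_cost_with_doubling(n, initial_capacity=1):
    """Count elements copied by resizes while inserting n items one at a time."""
    capacity = initial_capacity
    count = 0
    copied = 0
    for _ in range(n):
        if count == capacity:
            # Assumed cost model: a resize copies every stored element.
            copied += count
            capacity *= 2
        count += 1
    return copied


# Total copy work stays below 2 * n, so it is O(1) amortized per insertion.
for n in (8, 1000, 1025):
    assert copy_cost_with_doubling(n) < 2 * n
```

Under this model the resizes happen at sizes 1, 2, 4, and so on, and the geometric sum is bounded by twice the final element count. That is why doubling only adds a constant factor overall.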
Not sure how to solve the synchronization problem.