algoliasearch-client-python
Increasing memory usage when using replace_all_objects
Hello, I'm running into a memory issue with replace_all_objects. Since I'm indexing a significant number of documents (5 million), I pass an iterator to the function to minimize memory consumption.
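For context, here is a minimal sketch of the call (credentials, index name, and the document generator are placeholders; the real documents are streamed from our database):

```python
from algoliasearch.search_client import SearchClient

client = SearchClient.create("APP_ID", "ADMIN_API_KEY")
index = client.init_index("my_index")

def iter_documents():
    # documents are yielded one at a time instead of
    # materializing a 5M-element list in memory
    for i in range(5_000_000):
        yield {"objectID": str(i), "title": f"document {i}"}

index.replace_all_objects(iter_documents())
```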
I expect the memory usage to stay flat during the operation; however, it keeps increasing (see the image below).
Upon investigation, the increase appears to come from SearchIndex._chunk, and more specifically from the list raw_responses, which accumulates the response of every request sent.
https://github.com/algolia/algoliasearch-client-python/blob/3bb9108d9dff627f12c921ad23dab02984f70a44/algoliasearch/search_index.py#L505-L528
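Paraphrased, the pattern looks like this (a simplified sketch of the linked code, not a verbatim copy; _send_batch stands in for the actual transporter call):

```python
def _chunk(self, action, objects, request_options):
    raw_responses = []
    batch = []
    for obj in objects:
        batch.append(obj)
        if len(batch) == self._config.batch_size:
            # the full /batch response, objectIDs included, is appended
            # here and stays referenced until the whole iteration ends
            raw_responses.append(self._send_batch(action, batch, request_options))
            batch = []
    # the trailing partial batch is flushed the same way, then:
    return IndexingResponse(self, raw_responses)
```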
This is a problem because the response of /1/indexes/{indexName}/batch contains the list of objectIDs:
{
  "taskID": 792,
  "objectIDs": ["6891", "6892"]
}
With 5M documents, each with an objectID of ~15 characters, this alone accounts for roughly 300 MB:
>>> import sys
>>> sys.getsizeof("123456789012345") * 5_000_000 / (1024**2)
305.17578125
Is there a request_option to tell the API not to return the objectIDs, or a way for the client not to store them in raw_responses?
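For what it's worth, the workaround I'm testing in the meantime is to chunk manually and discard each response as it comes back (a sketch reusing iter_documents and index from above; unlike replace_all_objects, it updates the index in place rather than atomically swapping a temporary index):

```python
import itertools

def batches(iterable, size):
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

for chunk in batches(iter_documents(), 1_000):
    # one small IndexingResponse per iteration instead of 5M objectIDs;
    # it becomes collectable as soon as the loop moves on
    index.save_objects(chunk).wait()
```

Each call still goes through _chunk, but its raw_responses holds only that single chunk's response, so memory stays flat.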
Thank you 🙏