faiss
reconstruct_n() on macOS causes system error
Summary
I am unable to use the output of reconstruct_n() as a NumPy array. The reconstruction itself succeeds, but when I then train a new Kmeans index on the result, the process fails with a system error.
Reproduction instructions
import faiss
import numpy as np
d = 768
ncentroids = 15
niter = 2
faiss_index = faiss.IndexFlatL2(d)
x0 = np.random.random( (5000, d) )
faiss_index.add(x0)
x = faiss_index.reconstruct_n()
kmeans_index = faiss.Kmeans(d, ncentroids, niter=niter, verbose=True)
kmeans_index.train(x)
Error message
Sampling a subset of 3840 / 5000 for training
Clustering 3840 points in 768D to 15 clusters, redo 1 times, 2 iterations
Preprocessing in 0.01 s
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
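For readers unfamiliar with the number: exit code 139 is the shell convention of 128 plus the signal number, and signal 11 is SIGSEGV, as the message itself says. A minimal sketch of that decoding follows; the helper name `describe_exit_code` is made up for illustration and is not part of faiss.

```python
import signal


def describe_exit_code(code):
    """Map a shell-style exit code to a signal name.

    Returns None when the code is not of the form 128 + signal number.
    """
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return None


print(describe_exit_code(139))
```

Running the reproduction with `python -X faulthandler script.py` should also make Python print a traceback when the segfault happens, which can help narrow down the failing call.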
faiss-cpu == 1.8.0
Python 3.11.6 (v3.11.6:8b6ee5ba3b, Oct 2 2023, 11:18:21) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Platform
Model Name: MacBook Pro
Model Identifier: Mac14,9
Model Number: MPHE3LL/A
Chip: Apple M2 Pro
Total Number of Cores: 10 (6 performance and 4 efficiency)
Memory: 16 GB
System Firmware Version: 10151.81.1
OS Loader Version: 10151.81.1
I also found out that faiss fails to work after execution of the following code:
import nest_asyncio
nest_asyncio.apply()
This looks like some problem with async applications.
I can't repro on 1.7.4, probably an installation error. Please fill in the issue template to show how Faiss was installed.
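One way to gather part of the requested information is sketched below, using only the standard library. The package names in the default list are assumptions based on what this thread mentions (faiss-cpu, the llama_index faiss vector store, nest_asyncio), and `collect_environment` is a made-up helper name. It reports versions only, so it does not show whether faiss came from pip or conda; that still has to be stated by hand.

```python
import platform
import sys
from importlib import metadata


def collect_environment(
    packages=("faiss-cpu", "llama-index-vector-stores-faiss", "nest-asyncio"),
):
    """Gather interpreter, platform, and installed package versions for a bug report."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            report[name] = "not installed"
    return report


for key, value in collect_environment().items():
    print(f"{key}: {value}")
```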
Let me explain what happens in more detail.
If I run the following code:
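from llama_index.core import StorageContext  # assumed import path; missing from the original snippet
from llama_index.vector_stores.faiss import FaissVectorStore
import faiss
# vector_length and nodes are defined elsewhere in the application
faiss_index = faiss.IndexFlatL2(vector_length)
storage_context = StorageContext.from_defaults(
    vector_store=FaissVectorStore(faiss_index=faiss_index))
storage_context.docstore.add_documents(nodes)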
and then call the following somewhere else:
import faiss
import numpy as np
d = 768
ncentroids = 15
niter = 2
faiss_index = faiss.IndexFlatL2(d)
x0 = np.random.random( (5000, d) )
faiss_index.add(x0)
x = faiss_index.reconstruct_n()
kmeans_index = faiss.Kmeans(d, ncentroids, niter=niter, verbose=True)
kmeans_index.train(x)
It fails.
I managed to resolve the issue by importing the original faiss library first. The code below works.
# !!!! don't delete, need to import here, before anything else
import faiss

if __name__ == '__main__':
    # run all functionality in any order here
    pass
It looks like an issue with faiss memory management.
One should import faiss package first, then other packages that depend on it:
import faiss
from llama_index.vector_stores.faiss import FaissVectorStore
So, switch the order, like above.
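To check which import actually ran first in a larger application, something like the following sketch may help. It assumes that `sys.modules` keeps insertion order (true for dicts in current Python versions) and that a module is registered there when its import starts; `import_started_before` is a made-up helper name, not part of faiss or llama_index.

```python
import sys


def import_started_before(first, second, modules=None):
    """Report whether `first` was registered in the module table before `second`.

    Returns True or False for the relative order, or None if either module
    has not been imported at all.
    """
    table = sys.modules if modules is None else modules
    if first not in table or second not in table:
        return None
    names = list(table)  # dict keys come back in insertion order
    return names.index(first) < names.index(second)


# Call this after all imports in the application have run.
print(import_started_before("faiss", "llama_index.vector_stores.faiss"))
```

A result of False would mean the llama_index wrapper started importing before faiss did, which is the order reported as crashing in this thread.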
I am unable to reproduce the exit code 139. All of my attempts have resulted in exit code 0. I suggest you file the issue with https://github.com/run-llama/llama_index if the issue persists for you. Otherwise, feel free to file a new issue with fully reproducible code and install commands.