Unable to write IndexFlatIP in faiss-gpu
Summary
I am using Faiss to retrieve similar products. My embedding size is 1024. faiss-cpu was too slow, so I am trying faiss-gpu. However, I am unable to write an IndexFlatIP index to disk with write_index() in faiss-gpu, although the same call works in faiss-cpu.
Platform
OS: Ubuntu 20.04.5 LTS
Faiss version: faiss-gpu: 1.7.2
Installed from: pip
Running on:
- [ ] CPU
- [x] GPU
Interface:
- [ ] C++
- [x] Python
Reproduction instructions
Code
import numpy as np
import faiss

def create_dummy_embeddings():
    # Create dummy embeddings
    embeddings = np.random.rand(100000, 1024).astype('float32')
    print('Dummy embeddings created:', embeddings.shape)
    return embeddings

def create_index(embeddings):
    # Create index
    EMBEDDING_SIZE = 1024
    print('Embedding size:', EMBEDDING_SIZE)
    # GPU
    res = faiss.StandardGpuResources()  # use a single GPU
    ngpus = faiss.get_num_gpus()
    print("Number of GPUs:", ngpus)
    cpu_index = faiss.IndexFlatIP(EMBEDDING_SIZE)
    gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)
    gpu_index.add(embeddings)
    print('Number of vectors in index:', gpu_index.ntotal)
    faiss.write_index(gpu_index, 'faiss_index_dummy.index')
    print('Index saved to file.')

def load_index():
    index = faiss.read_index('faiss_index_dummy.index')
    print('Index loaded from file.')
    print('Number of vectors in index:', index.ntotal)

def main():
    embeddings = create_dummy_embeddings()
    create_index(embeddings)
    load_index()

if __name__ == '__main__':
    main()
Error
RuntimeError: Error in void faiss::write_index(const faiss::Index*, faiss::IOWriter*) at /project/faiss/faiss/impl/index_write.cpp:590: don't know how to serialize this type of index
Hoping to get this fixed! Thank you!
GPU indexes cannot be serialized directly. Please convert the index to a CPU index first with faiss.index_gpu_to_cpu(), then pass the result to write_index().
So basically:
- Create a cpu_index
- Convert it to a gpu_index
- Add vectors to the gpu_index
- Convert gpu_index back to cpu_index
- Write cpu_index
Am I correct?
And then when trying to use the index:
- Read in the cpu_index
- Convert to gpu_index
- Perform searches
Did I get this right?