
memory leak when index.search

Open Jar7 opened this issue 4 years ago • 6 comments

import faiss
from facebank import *
import pickle
import time
import gc
#index = faiss.index_factory(512, "IVF16384, Flat")
index = faiss.read_index('log/features_exp2/feature_00550220.index')
index_ivf = faiss.extract_index_ivf(index)
index_ivf.nprobe = 50

with open('log/features_exp2/records.add.00000000_00023729.pkl', 'rb') as f:
    features = pickle.loads(f.read())[0].data  # np.ndarray of shape (1, 512), dtype float32

for i in range(100000):
    rst = index.search(features, 1)
    if i % 100 == 0:
        print(faiss.get_mem_usage_kb())

# output: faiss.get_mem_usage_kb(), printed every 100 searches
1244664
1248024
1249692
1252228
1253488
1256024
1257284
1259820
1261080
1263616
1265988
1267676
1270200
1271472
1274008
1275268
...

Hello, is it normal that memory keeps increasing during search? Is it a memory leak? faiss==1.7.1.post3

Jar7 · Dec 27 '21

Depending on the memory allocation policy, it can be normal for memory usage to increase. Does it eventually blow up?

mdouze · Jan 11 '22
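
One way to distinguish allocator warm-up from a genuine leak is to check whether the per-interval growth shrinks toward zero as the loop runs. A minimal sketch of that check, reusing faiss.get_mem_usage_kb(); the toy IVF index below is only a stand-in for the real one:

import faiss
import numpy as np

d = 64
index = faiss.index_factory(d, 'IVF100,Flat')   # stand-in for the real index
xb = np.random.rand(10000, d).astype('float32')
index.train(xb)
index.add(xb)
q = np.random.rand(1, d).astype('float32')

prev = faiss.get_mem_usage_kb()
deltas = []
for i in range(1, 100001):
    index.search(q, 1)
    if i % 10000 == 0:
        cur = faiss.get_mem_usage_kb()
        deltas.append(cur - prev)   # growth in kB over the last 10k searches
        prev = cur
        print(i, cur, deltas[-1])

# allocator warm-up: the deltas shrink toward 0; a real leak: the deltas stay roughly constant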

Depending on the memory allocation policy, it can be normal for memory usage to increase. Does it eventually blow up?

No, it does not blow up. May I ask whether the rate of memory increase is related to the number of centroids?

Jar7 · Jan 11 '22
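
Whether the growth rate depends on the number of centroids can be tested directly: build otherwise-identical IVF indexes with different nlist values and compare the memory delta after the same number of searches. A rough sketch; the nlist values, dataset size, and loop count below are arbitrary:

import faiss
import numpy as np

d = 64
xb = np.random.rand(50000, d).astype('float32')   # synthetic data, enough to train IVF1024
q = np.random.rand(1, d).astype('float32')

for nlist in (16, 64, 256, 1024):
    index = faiss.index_factory(d, f'IVF{nlist},Flat')
    index.train(xb)
    index.add(xb)
    before = faiss.get_mem_usage_kb()
    for _ in range(20000):
        index.search(q, 1)
    print(nlist, faiss.get_mem_usage_kb() - before)   # growth in kB at this nlist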

Memory leak reproduced with the official faiss-cpu packages from conda. A simple IVF index triggers the leak:

# leak.py

import gc

import faiss
import numpy as np


k = 10
n = 1000
d = 8
index = faiss.index_factory(d, 'IVF100,Flat')

xb = np.arange(n * d, dtype='float32').reshape(-1, d)
q = np.zeros((1, d), dtype='float32')

index.train(xb)
index.add(xb)
faiss.omp_set_num_threads(1)

for i in range(100000000):
    _ = index.search(q, k)
    if i % 100 == 0:
        print(i, faiss.get_mem_usage_kb(), end='\r', flush=True, sep='\t')
    if i % 2000 == 0:
        gc.collect()

It's easy to reproduce:

$ conda create -n bad python=3.8 faiss-cpu -c pytorch  # the pytorch channel build uses MKL
$ conda activate bad
$ python leak.py # OOM killed eventually, depending on your available RAM

However, when using OpenBLAS instead of MKL, everything works fine:

$ conda create -n good python=3.8 numpy "blas=*=openblas"  # force OpenBLAS as numpy's BLAS backend
$ conda install -n good faiss-cpu -c pytorch
$ conda activate good
$ python leak.py # fine this time

After a quick investigation, I found that versions before 1.6.5 are fine with both OpenBLAS and MKL, while the 1.7.x series leaks memory with MKL but works fine with OpenBLAS, even with this latest fix.
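
If it is unclear which BLAS a given environment actually loads, numpy can report the libraries it was built against, and on Linux the dynamic dependencies of the faiss extension modules show MKL vs OpenBLAS (this assumes the compiled .so files sit inside the faiss package directory, as in the conda builds):

import glob, os, subprocess

import faiss
import numpy as np

np.show_config()   # lists the BLAS/LAPACK libraries numpy was built against

# list the shared-library dependencies of the faiss extensions (Linux only)
pkg_dir = os.path.dirname(faiss.__file__)
for so in glob.glob(os.path.join(pkg_dir, '*.so')):
    print(so)
    subprocess.run(['ldd', so])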

By the way, with the help of git bisect, I found that this issue was first introduced by commit 6d0bc58.

QwertyJack · Jan 19 '22

By the way, with the help of git bisect, I found that this issue was first introduced by commit 6d0bc58.

To correct myself, this issue is caused by this OMP statement: https://github.com/facebookresearch/faiss/commit/c5975cda#diff-a5957e17f0392d1e80c31eef1a637a13e418a94a0e9fc15ca377c980aa4b22e6R310

QwertyJack · Jan 24 '22
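
As a stopgap while the OMP/MKL interaction is unresolved, one possible mitigation is to keep the OpenMP thread count low and periodically ask MKL to release its internal buffers via the optional mkl-service package; whether mkl.free_buffers() actually reclaims the memory in this particular case is untested:

import faiss
import numpy as np

try:
    import mkl          # optional mkl-service package; exposes MKL memory controls
except ImportError:
    mkl = None

faiss.omp_set_num_threads(1)   # fewer OpenMP threads -> fewer per-thread MKL buffers

# toy index standing in for the real one
d = 64
index = faiss.index_factory(d, 'IVF100,Flat')
xb = np.random.rand(10000, d).astype('float32')
index.train(xb)
index.add(xb)
q = np.zeros((1, d), dtype='float32')

for i in range(100000):
    index.search(q, 10)
    if mkl is not None and i % 1000 == 0:
        mkl.free_buffers()   # ask MKL to free its internal buffer pool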

@QwertyJack I also had the same problem. Has this bug been fixed? My faiss-cpu was 1.7.1.post2; I updated to faiss-cpu==1.8.0, but that doesn't fix it.

tianjiahao · Jul 30 '24

@mdouze would you please let us know? I also have the same problem.

FedericoM25 · Aug 14 '24