
Help! core dump: trained on gpu, search on disk

Open Jar7 opened this issue 2 years ago • 4 comments

UPDATE: Sorry, I found that this also occurs with faiss-cpu, so it has nothing to do with the GPU (I uninstalled faiss-gpu and reinstalled faiss-cpu to check). Here is the code; you can run it to reproduce the issue.


Hello, can an index trained on GPU be merged into an on-disk index? I ran the following code and got a core dump.


import faiss
import numpy as np
from faiss.contrib.ondisk import merge_ondisk

index = faiss.index_factory(512, "IDMap,IVF32,Flat", faiss.METRIC_INNER_PRODUCT)
#index = faiss.index_cpu_to_all_gpus(index)
npy = np.random.rand(100,512).astype(np.float32)
index.train(npy)
#index = faiss.index_gpu_to_cpu(index)
faiss.write_index(index, 'trained_ivf32.index')

feats = np.random.rand(100,512).astype(np.float32)
index.add_with_ids(feats, np.array(range(len(feats))))
faiss.write_index(index, 'index_mem.index')

index = faiss.read_index('trained_ivf32.index')
out_ivfdata='trained_ivf32.ivfdata'
out_index = 'trained_ivf32_disk.index'
merge_ondisk(index, ['index_mem.index'], out_ivfdata)
print('merged', out_ivfdata)
faiss.write_index(index, out_index)

feat = np.random.rand(1,512).astype(np.float32)
index1 = faiss.read_index(out_index)
rst = index1.search(feat, 100)

faiss version: faiss-cpu==1.7.2

Jar7 avatar Jun 16 '22 11:06 Jar7

Hello, I have rewritten the code so that the issue can be reproduced. Please help. @mdouze

Jar7 avatar Jun 20 '22 09:06 Jar7

I found that it occurs with IDMap. Is there a bug in IDMap?

Jar7 avatar Jun 20 '22 10:06 Jar7

I can repro the issue and will look into it. However, why use IDMap? It does not make much sense with an IVF index, which already stores IDs itself.

https://github.com/facebookresearch/faiss/wiki/Pre--and-post-processing#ids-in-the-indexivf

mdouze avatar Jul 08 '22 15:07 mdouze

I have run into this bug as well:

res = faiss.StandardGpuResources()
flat_config = faiss.GpuIndexFlatConfig()
flat_config.device = 0
d = init.shape[1]
index = faiss.GpuIndexFlatIP(res, d, flat_config)
index = faiss.IndexIDMap(index)
cpu_index = faiss.index_gpu_to_cpu(index)

Error:

Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = float; BT = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at /home/conda/feedstock_root/build_artifacts/faiss-split_1685210641191/work/faiss/gpu/utils/MatrixMult-inl.cuh:254; details: cublas failed (7): (64, 4096) x (5183, 4096)' = (64, 5183) gemm params m 5183 n 64 k 4096 trA T trB N lda 4096 ldb 4096 ldc 5183
Aborted (core dumped)

nfrumkin avatar Jan 25 '24 04:01 nfrumkin