
cudaMalloc error: out of memory

Open yangsp5 opened this issue 2 years ago • 4 comments

Summary

Platform

OS: Linux Ubuntu 18.04

Faiss version: faiss 1.7.2

Installed from: conda install faiss-gpu

Faiss compilation options:

Running on:

  • [ ] CPU
  • [√] GPU

Interface:

  • [ ] C++
  • [√] Python

Reproduction instructions

  • Error info:
Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /home/conda/feedstock_root/build_artifacts/faiss-split_1663108094389/work/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type FlatData dev 0 space Device stream 0x55dfe2df9eb0 size 42466784256 bytes (cudaMalloc error out of memory [2])
  • It seems the GPU is out of memory. BUT!!!! I am using an A100 with 80 GB of memory, and the error happens when only 42 GB are in use. I don't know why, please help me.

  • nvidia-smi:

Mon Oct  3 08:30:34 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.129.06   Driver Version: 470.129.06   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:CF:00.0 Off |                    0 |
| N/A   31C    P0    74W / 400W |  42174MiB / 81251MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

yangsp5 avatar Oct 03 '22 08:10 yangsp5

Could you provide more details about which methods you are using from faiss-gpu? A minimal working example would be useful so we can help you pinpoint the issue @yangsp5 , thanks!

mlomeli1 avatar Oct 04 '22 20:10 mlomeli1

the code:

import faiss
import numpy as np
import torch  # needed for torch.load below


dim = 768
num_threads = 10
use_gpu = True

faiss.omp_set_num_threads(num_threads)
index = faiss.IndexFlatIP(dim)
index = faiss.IndexIDMap(index)
if use_gpu:
    index = faiss.index_cpu_to_all_gpus(index)


# insert into faiss with ids
# filepaths: list of saved embedding files (defined elsewhere)
for fp in filepaths:
    data = torch.load(fp)
    for embeddings, ids in data:
        # embeddings.shape  ---> [256000, 768]
        # ids ---> [1, 2, 3, 4, ....., 256000]
        index.add_with_ids(embeddings, np.array(ids))

yangsp5 avatar Oct 08 '22 02:10 yangsp5

This is due to the geometric doubling behavior of faiss::gpu::DeviceVector's append, which happens when you call add on an index that already has data in it:

https://github.com/facebookresearch/faiss/blob/main/faiss/gpu/utils/DeviceVector.cuh#L94

This is something we plan to fix: above a certain allocation size, the buffer will no longer double but instead grow by a much smaller factor, or even be sized exactly.
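A minimal sketch of why this hits the ~42 GB wall (the doubling policy is as in the linked DeviceVector.cuh; the batch size is taken from the snippet above, and the loop count is illustrative):

```python
# Simulate geometric (doubling) capacity growth: once the device vector
# holds roughly half of free GPU memory, the next append that overflows
# capacity asks cudaMalloc for about twice the current size at once.
def grow(capacity, needed):
    # double until the request fits (assumed growth policy)
    while capacity < needed:
        capacity *= 2
    return capacity

batch = 256_000 * 768 * 4      # one float32 batch, ~0.73 GiB
capacity = batch               # first add sizes the buffer to one batch
stored = 0
for _ in range(40):            # keep appending batches
    stored += batch
    capacity = grow(capacity, stored)

print(f"{capacity / 2**30:.1f} GiB capacity for {stored / 2**30:.1f} GiB of data")
```

So a failing ~42 GB cudaMalloc can be triggered while the index itself only holds ~21 GB of vectors, which matches the nvidia-smi output above.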

In the meantime, even though this is a very large amount of data (more than 40 GB), you may be able to avoid the issue by accumulating all of the data on the CPU and calling add_with_ids only once, so the GPU buffer is sized in a single allocation. This may not be feasible in your setting, but I understand the constraints.
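A sketch of that restructuring, with small random batches standing in for the tensors loaded from `filepaths` (the shapes here are illustrative, and the final faiss call is left commented since it needs a GPU index):

```python
import numpy as np

# Hypothetical stand-ins for the original loop's (embeddings, ids) pairs.
batches = [(np.random.rand(1000, 768).astype('float32'),
            np.arange(i * 1000, (i + 1) * 1000)) for i in range(5)]

# Concatenate everything on the CPU first ...
all_embeddings = np.concatenate([emb for emb, _ in batches])
all_ids = np.concatenate([ids for _, ids in batches]).astype('int64')

# ... then a single call sizes the GPU buffer exactly once, instead of
# repeatedly reallocating and doubling as batches arrive:
# index.add_with_ids(all_embeddings, all_ids)
```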

Also, Faiss indices accept Torch tensors directly if you import faiss.contrib.torch_utils, so no conversion to NumPy is needed.
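For example, a sketch assuming torch is installed (`index` stands for the IndexIDMap built above; the faiss lines are commented so the snippet stands alone):

```python
import torch

# import faiss
# import faiss.contrib.torch_utils  # patches index methods to accept torch tensors

embeddings = torch.rand(1000, 768, dtype=torch.float32)
ids = torch.arange(1, 1001, dtype=torch.int64)

# With torch_utils imported, no .numpy() round-trip is needed:
# index.add_with_ids(embeddings, ids)
```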

wickedfoo avatar Oct 10 '22 22:10 wickedfoo

thanks

yangsp5 avatar Oct 18 '22 07:10 yangsp5