faiss
GPU memory usage increases after creating new instances in different threads
Summary
Platform
OS:
Faiss version: 1.7.0
Installed from: pip (PyPI)
Faiss compilation options:
Running on: i9-9900X - 32 GB RAM - RTX 2080 Ti
- [ ] CPU
- [x] GPU
Interface:
- [ ] C++
- [x] Python
Reproduction instructions
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /__w/faiss-wheels/faiss-wheels/faiss/faiss/gpu/StandardGpuResources.cpp:443: Error: 'err == cudaSuccess' failed: Failed to cudaMalloc 1610612736 bytes on device 0 (error 2 out of memory
Outstanding allocations
Each time I create a new instance, faiss-gpu allocates a large amount of GPU memory. https://github.com/HoangTienDuc/github_issues/blob/faiss/subcrib_message.py#L34 So with 11 GB of GPU memory I can only create 5 threads. How can I solve this problem? I want to create more threads. Could you please take a look at https://github.com/HoangTienDuc/github_issues/tree/faiss#issue-detail
The rules are:
- Any GPU indices that share a single GpuResources object can only be used one at a time from any thread.
- A single GpuResources object can only be operated on sequentially, if shared between different threads.
- A single GpuResources object can be used with indices that reside on different GPUs, provided the above two conditions hold.
A GpuResources object attempts to reserve 1.5 GB by default for each GPU that it is used on.
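If the ~1.5 GB default temporary-memory reservation is what drives the out-of-memory error, it can be shrunk with `StandardGpuResources.setTempMemory` before any index is built. A minimal sketch (assumes a faiss-gpu build and an available CUDA device; the dimension `d` is arbitrary):

```python
import faiss  # requires the faiss-gpu build

# One resources object, shared by all indices on this GPU.
res = faiss.StandardGpuResources()

# Reduce the temporary scratch reservation from the ~1.5 GB default
# to 256 MB. Trade-off: large batched searches may become slower
# because they can no longer use the bigger scratch area.
res.setTempMemory(256 * 1024 * 1024)

d = 128  # example vector dimension
index = faiss.GpuIndexFlatL2(res, d)  # index built on the shared resources
```

With a smaller reservation per GpuResources object, more instances fit into the 11 GB of the card, though sharing one resources object across indices (as described above) avoids the per-instance cost entirely.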
Examples:
Create 4 GpuIndex instances on the same GPU, sharing the same GpuResources object:
- OK: using these 4 indices sequentially in the same CPU thread.
- OK: using these 4 indices sequentially from different CPU threads (one guarantees that the top-level call to one index in one CPU thread has returned before calling a different index from a different CPU thread; i.e., there is mutual exclusion between the calls).
- NOT OK: using these 4 indices concurrently from different CPU threads, where index calls from the different threads may overlap.
Create 2 GpuIndex instances on different GPUs, sharing the same GpuResources object:
- OK: using these 2 indices sequentially in the same CPU thread.
- OK: using these 2 indices sequentially on different CPU threads.
- NOT OK: using these 2 indices concurrently from different CPU threads.
I might consider adding a mutex to GpuResources itself so as to make this always safe, though you might not get the concurrency that you prefer when using different CPU threads.
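Until such a mutex exists inside GpuResources, the application can enforce the mutual exclusion itself with one lock per GpuResources object. A minimal sketch of the pattern (the faiss index is replaced by a hypothetical `FakeIndex` stand-in so the locking logic is shown on its own; in real code the wrapped object would be a `faiss.GpuIndexFlatL2` or similar):

```python
import threading

class SerializedIndex:
    """Wrap an index so that all calls through indices sharing one
    GpuResources object are mutually exclusive across CPU threads."""
    def __init__(self, index, lock):
        self._index = index
        self._lock = lock  # one lock per GpuResources object

    def search(self, queries, k):
        with self._lock:  # guarantees sequential use of the resources
            return self._index.search(queries, k)

# Hypothetical stand-in for a faiss GpuIndex, for illustration only.
class FakeIndex:
    def search(self, queries, k):
        return [(q, k) for q in queries]

# One lock shared by all 4 indices that share one GpuResources object.
shared_lock = threading.Lock()
indices = [SerializedIndex(FakeIndex(), shared_lock) for _ in range(4)]

results = []
def worker(idx):
    results.append(idx.search([1, 2], 2))

threads = [threading.Thread(target=worker, args=(ix,)) for ix in indices]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This serializes the GPU work, so the CPU threads do not actually search in parallel; true concurrency requires a separate GpuResources object (and its memory reservation) per thread.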
Did you solve it? I also want to use Faiss on multiple GPUs with multiple threads, and I don't know how to use it correctly.