
Running Atlas on small GPUs

Open prasad4fun opened this issue 1 year ago • 2 comments

Hi,

In the blog post and the paper, it's mentioned that with a faiss-pq code size of 64 the index needs as little as 2 GB. I keep getting CUDA out of memory on a 12 GB GPU while trying to run finetune_qa with faiss-pq code size 64 and models/atlas_nq/base.

What is the minimum GPU memory requirement for running the Atlas model during QA fine-tuning and at inference time?
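
For context, my 2 GB expectation comes from this rough estimate (a sketch; the passage count and the 768-dim fp16 embedding assumption are mine, not figures from the repo):

    # Rough estimate of the compressed index footprint (sketch, not Atlas code).
    n_passages = 30_000_000          # assumed number of passage embeddings
    code_size = 64                   # PQ code size 64 -> 64 bytes per vector
    print(f"PQ index:  {n_passages * code_size / 1e9:.1f} GB")              # ~1.9 GB

    # The same passages stored as a flat fp16 index (assuming 768-dim embeddings)
    # would need far more:
    dim, bytes_per_val = 768, 2
    print(f"flat fp16: {n_passages * dim * bytes_per_val / 1e9:.1f} GB")    # ~46 GB

So the ~2 GB figure presumably refers to the compressed index alone, not the total GPU memory needed during fine-tuning.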

prasad4fun avatar Apr 03 '23 07:04 prasad4fun

Something is up with the finetune code. Even on 2x40 GB GPUs with the base model and code size 1, GPU memory hits about 25 GB, then it tries to allocate roughly the same amount again and OOMs:

  File "/home/amicus/atlas/src/index.py", line 111, in load_index
    self.embeddings = torch.concat(embeddings, dim=1)
RuntimeError: CUDA out of memory. Tried to allocate 22.99 GiB (GPU 1; 47.54 GiB total capacity; 23.00 GiB already allocated; 22.92 GiB free; 23.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
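
From the traceback, load_index concatenates all the embedding shards directly on the GPU, so PyTorch briefly needs room for both the shards and the merged copy, which roughly doubles the peak footprint. A minimal sketch of the kind of change that would avoid this (not the actual Atlas fix, just the idea; `embeddings` stands in for the list of shard tensors built in load_index):

    # Sketch: concatenate the shards on CPU so the GPU never holds two full copies.
    import torch

    def concat_embeddings_cpu(embeddings, device=None):
        # `embeddings` is assumed to be a list of [dim, n_i] tensors, as passed
        # to the failing torch.concat call.
        merged = torch.concat([e.cpu() for e in embeddings], dim=1)
        embeddings.clear()            # drop references to the GPU shards
        torch.cuda.empty_cache()      # return the freed shard memory to CUDA
        return merged if device is None else merged.to(device)

Keeping the merged embeddings on CPU (or storing them in fp16) would be one way to stay under 40 GB, though I haven't traced whether the rest of the index code expects them on the GPU.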

kungfu-eric avatar Feb 13 '24 21:02 kungfu-eric

Is it possible to train and test a model using the free version of Google Colab, without a high-end GPU? I'm a student and I want to train and test this model.

DanialPahlavan avatar May 25 '24 15:05 DanialPahlavan