
Ran out of memory

Open xinyu-dev opened this issue 1 year ago • 9 comments

I tested 0.3.0 with different sequences and GPUs. In some cases I see a `ran out of memory, skipping batch` error. Here are the details:

Command:

! boltz predict input/<specific_yaml_file> --out_dir output --devices 1 --output_format pdb --use_msa_server --num_workers 2

Input 1: keytruda.yaml:

version: 1
sequences:
  - protein:
      id: A
      sequence: EIVLTQSPATLSLSPGERATLSCRASKGVSTSGYSYLHWYQQKPGQAPRLLIYLASYLESGVPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHSRDLPLTFGGGTKVEIK
  - protein:
      id: B
      sequence: VQLVQSGVEVKKPGASVKVSCKASGYTFTNYYMYWVRQAPGQGLEWMGGINPSNGGTNFNEKFKNRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARRDYRFDMGFDYWGQGTTVTVSS
  • AWS SageMaker g5.8xlarge (A10-24G), p3.2xlarge(V100-16G): out of memory
  • AWS SageMaker g6.8xlarge (L4-24G): Good

Input 2:

adalimumab.yaml:

version: 1
sequences:
  - protein:
      id: A
      sequence: DIQMTQSPSSLSASVGDRVTITCRASQGIRNYLAWYQQKPGKAPKLLIYAASTLQSGVPSRFSGSGSGTDFTLTISSLQPEDVATYYCQRYNRAPYTFGQGTKVEIK
  - protein:
      id: B
      sequence: EVQLVESGGGLVQPGRSLRLSCAASGFTFDDYAMHWVRQAPGKGLEWVSAITWNSGHIDYADSVEGRFTISRDNAKNSLYLQMNSLRAEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSS 
  • AWS SageMaker g5.8xlarge (A10-24G), p3.2xlarge(V100-16G), g6.8xlarge (L4-24G): out of memory
  • DGX A100-80G: Good

Input 3: the example multimer from your repo:

version: 1  # Optional, defaults to 1
sequences:
  - protein:
      id: A
      sequence: MAHHHHHHVAVDAVSFTLLQDQLQSVLDTLSEREAGVVRLRFGLTDGQPRTLDEIGQVYGVTRERIRQIESKTMSKLRHPSRSQVLRDYLDGSSGSGTPEERLLRAIFGEKA
  - protein:
      id: B
      sequence: MRYAFAAEATTCNAFWRNVDMTVTALYEVPLGVCTQDPDRWTTTPDDEAKTLCRACPRRWLCARDAVESAGAEGLWAGVVIPESGRARAFALGQLRSLAERNGYPVRDHRVSAQSA
  • AWS SageMaker g5.8xlarge (A10-24G), p3.2xlarge (V100-16G), g6.8xlarge (L4-24G) and DGX A100-80G: All good

My feeling is that the main factor is GPU memory: e.g. going from 24G to 80G does resolve the OOM for some of the larger complexes. But the GPU model (e.g. A10-24G vs L4-24G) can lead to different outcomes even at the same memory size.

xinyu-dev avatar Dec 03 '24 14:12 xinyu-dev

same here running into OOM issue for a multimer on an RTX4090 24GB

YogBN avatar Dec 03 '24 16:12 YogBN

I also suspect there is something about the data returned by the public ColabFold MMseqs2 server. Tamarind Bio's Boltz server seemingly runs on a single L4 GPU and has no issue with any of the sequences above that I tested. I don't know which MSA tool they use, though.

xinyu-dev avatar Dec 03 '24 17:12 xinyu-dev

Same OOM issue. I'm using version 0.3.0, which is supposed to resolve the memory issue... The prediction fails with OOM on an L4 GPU with 24GB memory (g6.4xlarge).

It works with 2 input protein sequences, but fails with 3. Each input sequence is ~110-150 aa.

RJWANGbioinfo avatar Dec 04 '24 03:12 RJWANGbioinfo
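
[Editor's note] For a back-of-envelope sense of why a third chain can tip a 24 GB card over the edge: pairformer-style structure-prediction models build a pair representation whose memory grows roughly with the square of the *total* residue count across all chains, not with the number of chains. A minimal sketch (helper name is hypothetical, not part of Boltz):

```python
def total_residues(chains: dict) -> int:
    """Total residue count across all chains; the pair representation
    scales with roughly the square of this number."""
    return sum(len(seq) for seq in chains.values())

# Three ~130 aa chains vs. two: the third chain does not add a third
# more memory, it roughly doubles the O(N^2) pair representation.
two = total_residues({"A": "M" * 130, "B": "M" * 130})                    # 260 residues
three = total_residues({"A": "M" * 130, "B": "M" * 130, "C": "M" * 130})  # 390 residues
print(f"pair-memory ratio: {(three / two) ** 2:.2f}x")  # 2.25x
```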

We just released v0.3.2, which should address some of these issues. You can update with `pip install boltz -U`. When testing, please remove any existing output folder for your input and run again! Please let us know.

jwohlwend avatar Dec 04 '24 20:12 jwohlwend

I just installed the current version via pip (24-12-18) and I'm running into the same memory issue on an RTX 2080 Titan with 11 GB and an AMD Threadripper with 64 GB RAM. Protein A: 427 aa, protein B: 135 aa, protein C: 109 aa.

Using only B & C, I get 9.3 GB memory usage on the GPU and it works fine. Is that amount of memory normal for this task?

jubosch avatar Dec 18 '24 19:12 jubosch
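
[Editor's note] Scaling the measured 9.3 GB for B + C (135 + 109 = 244 aa) up to A + B + C (671 aa) under a quadratic-memory assumption suggests why 11 GB is nowhere near enough for the full complex. This is a rough editor's estimate, not an official memory model for Boltz:

```python
def scale_quadratic(measured_gib: float, n_measured: int, n_target: int) -> float:
    """Extrapolate peak GPU memory assuming it grows with the square of
    the total residue count (a rough model for pairformer-style networks)."""
    return measured_gib * (n_target / n_measured) ** 2

# 9.3 GiB observed for chains B + C (135 + 109 aa); chain A adds 427 aa.
estimate = scale_quadratic(9.3, 135 + 109, 427 + 135 + 109)
print(f"estimated peak for A+B+C: {estimate:.1f} GiB")  # ~70 GiB
```

Real usage will differ (activation checkpointing, precision, and implementation details all matter), but the order of magnitude explains the OOM.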

Hi @jubosch, yes, unfortunately with an 11 GB GPU you will likely only be able to fold relatively small systems

gcorso avatar Dec 27 '24 16:12 gcorso

Can it be distributed over two GPUs, for 22 GB in total? Jürgen




jubosch avatar Dec 27 '24 17:12 jubosch

Unfortunately the code does not yet support distributing individual complexes across GPUs.

gcorso avatar Dec 27 '24 18:12 gcorso

@gcorso apologies if I missed it, but is there a way to run the set-up described here (multimer yaml) with already-computed MSAs that come from the colab_search output (i.e. a local ColabFold installation)?

gieses avatar Jan 30 '25 13:01 gieses
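
[Editor's note] Not from the maintainers, but the Boltz input schema accepts an `msa` field per protein pointing at a precomputed `.a3m` file, which lets you skip `--use_msa_server`. Whether `colabfold_search` output can be dropped in directly may require conversion; the paths below are illustrative only:

```yaml
version: 1
sequences:
  - protein:
      id: A
      sequence: MAHHHHHH...          # truncated for illustration
      msa: ./msas/chain_A.a3m        # precomputed MSA (hypothetical path)
  - protein:
      id: B
      sequence: MRYAFAAE...          # truncated for illustration
      msa: ./msas/chain_B.a3m
```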