alphafold
Minimum system specifications for standalone operation?
I've got two-thirds of the workflow described in the RFDiffusion paper from David Baker's lab up and running on my home PC. I've deployed RFDiffusion itself and ProteinMPNN. I would like to add AlphaFold.
I have concerns about whether AlphaFold will run at all on my PC. I have read some crash reports, so I am looking through the AlphaFold documents for a minimum system specification. I haven't found one.
The closest thing I found in the documentation shows one working cloud configuration and some performance data. Using a rather formidable-looking virtual machine instance on Google Cloud, some fairly impressive inference speeds are achieved.
I would be happy to run 50 times slower than the rates shown, but I want to be certain that I can submit a 300-residue protein without causing a system crash.
What are the "must haves" for AlphaFold operation? I'll be happy to purchase a 4 TB SSD. I can't afford an A100 GPU. I don't know exactly how to compare the cloud computing specs to the specs for a standalone machine.
Thanks for any information you can provide.
I don't know about actual minimum requirements, but I do know that the hardware in a p3.2xlarge AWS instance had no such issues for proteins much larger than that. By far the most common hardware-related cause of crashes I have seen on this repository is running out of GPU memory. If you have a dedicated GPU with 16+ GB, you should be fine, and for proteins of the size you are talking about even 8 GB is likely adequate. The closest thing to minimum requirements I have seen is in the README, where it describes the reduced_dbs version as being designed for 8 CPU cores and 8 GB of RAM. I would expect headaches if you had less than 8 GB of either GPU memory or system RAM, and a massive speed loss if you did not have a dedicated GPU.
Also, you probably want to make sure you are using an NVIDIA GPU.
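For anyone who wants to check a machine against the rules of thumb above, here is a minimal Python sketch. The thresholds (8 CPU cores, 8 GB of system RAM, 8 GB of GPU memory) are only the informal figures mentioned in this thread, not official AlphaFold minimums, and the function names `check_specs`, `detect_gpu_gb`, and `detect_ram_gb` are hypothetical helpers written for this example. It assumes `nvidia-smi` is on the PATH, and the RAM detection relies on `os.sysconf`, which generally works on Linux only.

```python
import os
import subprocess

# Informal rules of thumb from this thread, NOT official AlphaFold minimums.
MIN_CPU_CORES = 8
MIN_RAM_GB = 8
MIN_GPU_GB = 8


def check_specs(cpu_cores, ram_gb, gpu_gb):
    """Return a list of warnings; an empty list means no concerns were found."""
    warnings = []
    if cpu_cores < MIN_CPU_CORES:
        warnings.append(f"CPU: {cpu_cores} cores, fewer than {MIN_CPU_CORES}")
    if ram_gb is None:
        warnings.append("RAM: could not be determined")
    elif ram_gb < MIN_RAM_GB:
        warnings.append(f"RAM: {ram_gb:.1f} GB, less than {MIN_RAM_GB} GB")
    if gpu_gb is None:
        warnings.append("GPU: no NVIDIA GPU detected")
    elif gpu_gb < MIN_GPU_GB:
        warnings.append(f"GPU: {gpu_gb:.1f} GB, less than {MIN_GPU_GB} GB")
    return warnings


def detect_gpu_gb():
    """Total memory of the first NVIDIA GPU in GiB, or None if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        return float(out.splitlines()[0]) / 1024  # nvidia-smi reports MiB
    except (OSError, subprocess.CalledProcessError, IndexError, ValueError):
        return None


def detect_ram_gb():
    """Physical system RAM in GiB, or None where os.sysconf cannot report it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
        return pages * page_size / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    problems = check_specs(os.cpu_count() or 0, detect_ram_gb(), detect_gpu_gb())
    for line in problems or ["No concerns against the thread's rules of thumb."]:
        print(line)
```

Reported totals are often slightly below the nominal size (a card sold as 8 GB may report a little under 8 GiB), so a warning right at the boundary should be read loosely. Passing this check does not guarantee that a given protein will fit in memory.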
Thanks for your replies, @tcoates5.
I've done a fair amount of machine learning work. At this point, I think that even a casual machine learning user won't try to make things work without an NVIDIA GPU and CUDA.
I have an NVIDIA GTX 1660 Super with 6 GB of video RAM. This GPU is being used by RFDiffusion and ProteinMPNN, but you're suggesting that its memory might not be adequate for AlphaFold. I do want the option to use the full AF database, so I might have to upgrade the GPU. If there's a way to add RAM to an existing GPU card, I am unaware of it. The 4060 Ti with 16 GB of video RAM is at the upper end of my upgrade budget, but it's nowhere near as expensive as an A100.
I hope some other people can share their experiences as well.
Happy to help! I would recommend trying it out as soon as you have adequate SSD storage. I looked back at some other runs I did, and it looks like I have run proteins double that size on 8 GB video cards without crashes.