gpu4pyscf
Benchmark on A10G and H100
Amazing work. Thanks for the benchmarks on A100 and V100. Has anyone tried the A10G or H100?
Is a double-precision card faster than a single-precision card here? Will the A100 be faster than the L40 for gpu4pyscf? If I want to purchase a new GPU, should I consider the A100 or V100 over the L40, or even the RTX 4090/5090?
Currently, GPU4PySCF generally runs faster on the A100 than on the L40. For certain algorithms such as density fitting, the A100 can be 10x faster than the L40, which is consistent with the A100's much higher double-precision throughput. That said, consumer-grade GPUs (RTX 4090/RTX 5090) are much cheaper.
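If you want to run your own timings on an A10G, H100, or L40, a minimal density-fitted DFT job looks roughly like the sketch below. The geometry, basis set, and functional are arbitrary illustrative choices; swap in whatever system you actually want to benchmark.

```python
import time
import pyscf
from gpu4pyscf.dft import rks

# Small test molecule (water); replace with the system you want to benchmark.
mol = pyscf.M(
    atom="""
    O   0.0000   0.0000   0.1174
    H  -0.7570   0.0000  -0.4696
    H   0.7570   0.0000  -0.4696
    """,
    basis="def2-tzvpp",
)

# Density-fitted Kohn-Sham SCF running on the GPU.
mf = rks.RKS(mol, xc="b3lyp").density_fit()

t0 = time.time()
e_dft = mf.kernel()
print(f"E(DFT) = {e_dft:.8f} Ha, wall time = {time.time() - t0:.1f} s")
```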
Can GPU4PySCF run on more than one GPU?
@aris1978 Yes, you can run it on a multi-GPU system. This feature is still experimental. We are still in the process of improving its performance.
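A minimal sketch of making two cards visible to the process, assuming the GPUs are selected through the standard CUDA runtime variable `CUDA_VISIBLE_DEVICES` (this is a generic CUDA mechanism, not a GPU4PySCF-specific flag). Whether a given step is actually parallelized over both cards depends on the experimental multi-GPU support in your GPU4PySCF version.

```python
import os

# Select GPUs 0 and 1 before importing any CUDA-backed library,
# so both devices are visible to the process.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import cupy  # GPU4PySCF uses CuPy for its GPU arrays

print("GPUs visible to this process:", cupy.cuda.runtime.getDeviceCount())

# The density-fitted job itself is set up exactly as in the single-GPU
# example above; no code changes are needed on the user side.
```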
On a multi-GPU system, is the VRAM additive? (For example, can two GPUs with 16 GB of VRAM each run a job that would need 32 GB of VRAM on a single GPU?)
@aris1978 Yes, large intermediate variables are distributed across the GPUs. This feature is still experimental. Please let us know if you find any issues.
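Since GPU4PySCF is built on CuPy, a quick way to see the aggregate VRAM across the visible cards is to sum CuPy's per-device memory info, as in the sketch below. This only reports capacity; it does not by itself confirm that a particular intermediate is actually being distributed.

```python
import cupy

# Sum free/total device memory over all GPUs visible to the process.
n_gpu = cupy.cuda.runtime.getDeviceCount()
free_sum = 0
total_sum = 0
for i in range(n_gpu):
    with cupy.cuda.Device(i):
        free, total = cupy.cuda.runtime.memGetInfo()  # bytes on the current device
        free_sum += free
        total_sum += total

print(f"{n_gpu} GPUs: {free_sum / 1e9:.1f} GB free of {total_sum / 1e9:.1f} GB total")
```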