gpu4pyscf

Benchmark on A10G and H100

Open AlexanderMath opened this issue 1 year ago • 6 comments

Amazing work. Thanks for the benchmarks on A100 and V100. Has anyone tried the A10G or the H100?

AlexanderMath, Jun 18 '24 07:06

Is a GPU with strong double-precision performance faster than a single-precision-oriented GPU? Will the A100 be faster than the L40 for gpu4pyscf? If I want to purchase a new GPU, should I consider the A100 or V100 over the L40, or even the RTX4090/5090?

aris1978, Feb 21 '25 03:02

Currently, GPU4PySCF generally runs faster on the A100 than on the L40. For certain algorithms, such as density fitting, the A100 can be 10x faster than the L40. That said, consumer-grade GPUs (RTX4090/RTX5090) are much cheaper.

wxj6000, Feb 21 '25 05:02

Can GPU4PySCF run on more than one GPU?

aris1978, Feb 24 '25 01:02

@aris1978 Yes, you can run it on a multi-GPU system. This feature is still experimental, and we are still working on improving its performance.

wxj6000, Feb 24 '25 02:02

In a multi-GPU system, is the VRAM additive? For example, would two GPUs with 16 GB of VRAM each be able to run a job that requires 32 GB of VRAM on a single GPU?

aris1978, Mar 04 '25 00:03

@aris1978 Yes, the large intermediate variables are distributed across the GPUs. This feature is still experimental. Please let us know if you run into any issues.

wxj6000, Mar 05 '25 04:03
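
To make the "additive VRAM" answer concrete, here is a back-of-the-envelope sketch in plain Python. It is not GPU4PySCF code and does not reflect its actual memory manager. It assumes a simplified model in which the large intermediates are split evenly across devices while some smaller data is replicated on every device. The function names and the example sizes are hypothetical.

```python
def per_gpu_share_gb(distributed_gb: float, n_gpus: int) -> float:
    """Size of one device's slice, assuming an even split (an assumption)."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return distributed_gb / n_gpus


def job_fits(distributed_gb: float, replicated_gb: float,
             vram_per_gpu_gb: float, n_gpus: int) -> bool:
    """Check whether each device can hold its slice plus the replicated data.

    distributed_gb: total size of intermediates that are split across GPUs
    replicated_gb:  data assumed to be copied onto every GPU (hypothetical)
    """
    needed = replicated_gb + per_gpu_share_gb(distributed_gb, n_gpus)
    return needed <= vram_per_gpu_gb


# Hypothetical job: 24 GB of distributed intermediates, 2 GB replicated.
print(job_fits(24.0, 2.0, 16.0, n_gpus=1))  # 2 + 24 = 26 GB > 16 GB
print(job_fits(24.0, 2.0, 16.0, n_gpus=2))  # 2 + 12 = 14 GB <= 16 GB
```

Under this model, only the distributed part scales with the number of GPUs. Anything replicated on each device still counts against every card's VRAM, so two 16 GB cards would not necessarily behave like one 32 GB card.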