Compiled version runs too slowly

Open terU3760 opened this issue 3 years ago • 2 comments

I have successfully compiled apex on A100 hardware, but when I run the fmha test, it takes 21 seconds to finish. What could be the cause?

terU3760, Aug 06 '22 20:08

I don't know your setup, but one local run went as follows:

root@a33ccb515d34:/opt/pytorch/apex# python apex/contrib/test/fmha/test_fmha.py
Test s=128 b=32, zero_tensors=False
Test s=128 b=32, zero_tensors=True
.Test s=256 b=32, zero_tensors=False
Test s=256 b=32, zero_tensors=True
.Test s=384 b=32, zero_tensors=False
Test s=384 b=32, zero_tensors=True
.Test s=512 b=32, zero_tensors=False
Test s=512 b=32, zero_tensors=True
Test s=512 b=2, zero_tensors=False
Test s=512 b=2, zero_tensors=True
Test s=512 b=3, zero_tensors=False
Test s=512 b=3, zero_tensors=True
.
----------------------------------------------------------------------
Ran 4 tests in 1.637s

OK

crcrpar, Aug 09 '22 17:08

Hi, thank you for your reply. I don't know what configuration you used. With my configuration, I used CUDA 11.4 and Anaconda to build apex from source on an A100, and the result is:

(......) ...@...:~/apex/apex/contrib/test/fmha$ python test_fmha.py
Test s=128 b=32
.Test s=256 b=32
.Test s=384 b=32
.Test s=512 b=32
Test s=512 b=2
Test s=512 b=3
.
----------------------------------------------------------------------
Ran 4 tests in 23.213s

OK

Could you please explain the difference?

terU3760, Oct 01 '22 17:10
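
The thread does not establish why the two runs differ (1.637s versus 23.213s for the same 4 tests). One way to narrow it down, sketched below, is to check whether the extra time is a one-time startup cost or is paid on every call. This is only a sketch under that assumption: `time_first_vs_rest` and `workload` are hypothetical names, not part of apex, and the stand-in workload uses `time.sleep` instead of a real fmha call. For GPU code, the callable would also need to call `torch.cuda.synchronize()` before returning, since kernel launches are asynchronous.

```python
import time


def time_first_vs_rest(fn, repeats=5):
    """Time the first call of fn() separately from later calls.

    Returns (first_call_seconds, mean_later_call_seconds).
    A large gap between the two suggests a one-time startup cost.
    """
    start = time.perf_counter()
    fn()
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        later.append(time.perf_counter() - start)
    return first, sum(later) / len(later)


if __name__ == "__main__":
    # Hypothetical stand-in workload: slow on the first call only.
    state = {"initialized": False}

    def workload():
        if not state["initialized"]:
            time.sleep(0.2)  # simulated one-time setup
            state["initialized"] = True
        time.sleep(0.01)  # simulated steady-state work

    first, steady = time_first_vs_rest(workload)
    print(f"first call: {first:.3f}s, later calls (mean): {steady:.3f}s")
```

If the first call dominates, the slowdown is in setup rather than in the kernels themselves; if every call is slow, the build or environment is the more likely place to look.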