
Set flag FLAGS_enable_cublas_tensor_op_math to True by default.

Open: GhostScreaming opened this issue 3 years ago • 1 comment

When FP32 and FP16 models run on an A100 machine, they can be accelerated with Tensor Cores. Although NVIDIA states that FP32 computation is routed to Tensor Cores automatically on A100, we observed that this is not the case. As a result, we manually set the flag FLAGS_enable_cublas_tensor_op_math to True by default.
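For readers who want to try the same setting without this change, here is a minimal sketch of one way to turn the flag on from a launch script. It relies on Paddle reading `FLAGS_*` environment variables at import time. The helper name `enable_tensor_op_math` is hypothetical and is not part of PaddleFleetX or Paddle.

```python
import os

# Flag named in this issue; Paddle reads FLAGS_* environment variables
# when it is first imported.
FLAG_NAME = "FLAGS_enable_cublas_tensor_op_math"


def enable_tensor_op_math(env=None):
    """Hypothetical helper: default the flag to "True" in the given
    environment mapping, without overriding a value the user already set.
    Returns the value that ends up in effect."""
    env = os.environ if env is None else env
    env.setdefault(FLAG_NAME, "True")
    return env[FLAG_NAME]


# Call this before importing paddle so the environment variable is
# already in place when Paddle initializes its flags.
enable_tensor_op_math()
# import paddle  # assumption: Paddle is installed; import after setting the flag
```

Using `setdefault` keeps an explicit user choice (for example `FLAGS_enable_cublas_tensor_op_math=False` exported in the shell) intact, so the sketch only changes the default. Setting the flag at runtime through `paddle.set_flags` may also work, but that has not been verified here for this particular flag.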

GhostScreaming, Nov 08 '22 08:11

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant, Sep 28 '24 07:09